-
OACCT – Help (draft!)
+
+
Help
+
+
How to use the tool
-
- A database search can be performed by using three search boxes (Institution, funder and journal). The search results contain the following information:
-
- General information about the selected journal with a link to the journal’s publication conditions
- APC discount or information regarding a specific deal for Gold OA
- Information about Green OA conditions (source: Sherpa/Romeo)
- QOAM score
- To be completed
-
+
+ You can search the database using three search boxes (institution, funder and journal). Once you click "Check", the results are displayed below the search fields.
+
+
Selected options
+
+
This field contains a brief summary of your selected options (institution, funder and/or journal) in the form of cards.
+
+
+ Swiss institution
+
+ Name
+ Founding year
+ Website
+ Link to the institutional repository (if available)
+
+
+ Funder
+
+ Name
+ Country
+ Website
+
+
+ Journal
+
+ Title
+ ISSN
+ Link to the journal or publisher's website
+ Open Access status
+ Language(s)
+ DOAJ/LOCKSS/PORTICO information
+ QOAM score
+
+
-
-
- API info to be added (maybe on a different page)
-
- Data reuse & licence
-
- Please see our terms of use to obtain information about how the data provided by this service may be reused.
-
+ Search results
+
+ The search results provide an overview of the costs and main benefits of publishing in a given journal
+ or of making use of a publishing agreement. They take the form of institutions’ OA policies,
+ journals’ publishing policies and publishing agreements, and are grouped by the version
+ of the publication they concern (submitted/preprint, accepted/postprint or published/final).
+ The type and number of search results depend on which search fields are filled in:
+ choosing both an institution and a journal, for example, shows the institution’s OA policy alongside
+ the journal’s publishing policy conditions (and any existing publishing agreements),
+ which makes them easier to compare.
+ Like the selected options, the condition sets are displayed as cards and contain the following information:
+
+
+
+ Type of condition set (institutions’ OA policies, journals’ publishing policies and publishing agreements)
+ A set of four conditions, consisting of
+
+ Cost factors: article processing charge (APC), APC discount or refund
+ The licence under which the publication appears
+ Indicator of whether the publication may be archived in the authors' institutional repository
+ Embargo period
+
+
+ Additional conditions defined by a publisher, funder or institution (free text)
+ The term card reference number (Cxxxx/Tyyyy)
+ A "Modification request" button that allows notifying the platform's administrators of errors in the corresponding term card.
+
+
+
+
API access
+
+
+
+
+
+
+
+
+ The content of this web site and the data provided through the API are distributed under the
+
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
+
+
)
}
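The help text describes a search driven by up to three fields, and the application code shares results through a `check` URL whose query string carries `institution`, `funder` and `journal` parameters. As a minimal sketch of that mapping (the helper names `buildCheckSearch` and `parseCheckSearch` are hypothetical and not part of the code base, and the ids are made-up values), the query string could be built and read back like this:

```javascript
// Hypothetical helper: builds the search part of a check URL from the
// selected ids. Missing or empty ids are skipped.
function buildCheckSearch({ institution, funder, journal } = {}) {
  const params = new URLSearchParams();
  if (institution) params.set("institution", institution);
  if (funder) params.set("funder", funder);
  if (journal) params.set("journal", journal);
  return params.toString();
}

// Hypothetical helper: reads the same three ids back from a search string.
// Parameters that are absent come back as null.
function parseCheckSearch(search) {
  const params = new URLSearchParams(search);
  return {
    institution: params.get("institution"),
    funder: params.get("funder"),
    journal: params.get("journal"),
  };
}

// Usage with made-up ids.
console.log(buildCheckSearch({ institution: 12, journal: 345 }));
console.log(parseCheckSearch("?institution=12&journal=345"));
```

Building the string through `URLSearchParams` rather than by hand keeps the parameter order fixed and takes care of encoding, which may matter if ids ever contain reserved characters.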
diff --git a/assets/src/pages/Noresult.js b/assets/src/pages/Noresult.js
index 6c77effe..109239b5 100644
--- a/assets/src/pages/Noresult.js
+++ b/assets/src/pages/Noresult.js
@@ -1,24 +1,41 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import React from "react"
import "./noresult.css"
export default function Noresult () {
return (
)
}
diff --git a/assets/src/pages/SearchFilterFields.css b/assets/src/pages/SearchFilterFields.css
index 8eb8360c..bcb46368 100644
--- a/assets/src/pages/SearchFilterFields.css
+++ b/assets/src/pages/SearchFilterFields.css
@@ -1,49 +1,53 @@
.form-input {
margin-bottom: 1rem !important;
}
- .App-btn {
+
+.App-btn {
background-color: #3771C8 !important;
color: white !important;
width: 99% ;
}
+.field-comment {
+ font-size: .75em;
+}
.App-btn:hover {
background-color: #D40000;
}
@media only screen and (min-width: 768px) {
.form-input {
margin-right: 1rem !important;
}
.App-btn {
width: 99%;
background-color: #3771C8 !important;
color: white !important;
}
.App-btn:hover {
background-color: #D40000 !important;
}
}
@media only screen and (min-width: 1024px) {
.form-input {
margin-right: 1rem !important;
}
.App-btn {
width: 99%;
background-color: #3771C8 ;
color: white;
bottom: -5px;
}
.App-btn:hover {
background-color: #D40000;
}
}
\ No newline at end of file
diff --git a/assets/src/pages/SearchFilterFields.js b/assets/src/pages/SearchFilterFields.js
index 121b0b2d..3f35b7a9 100644
--- a/assets/src/pages/SearchFilterFields.js
+++ b/assets/src/pages/SearchFilterFields.js
@@ -1,1227 +1,1276 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import React, {useContext, useState, useEffect} from 'react';
import "./SearchFilterFields.css"
import { makeStyles } from '@material-ui/core/styles';
import Button from '@material-ui/core/Button';
import FormControl from '@material-ui/core/FormControl';
import TextField from '@material-ui/core/TextField';
import Autocomplete from '@material-ui/lab/Autocomplete';
import { searchCondi, searchorganizationonly, searchjournalonly, searchInstitFunder, searchCondi3 } from '../services/requests/Condition'
import {getJournal} from '../services/requests/Journal'
import {getFunder} from '../services/requests/Funder'
import {getInstitution} from '../services/requests/Institution'
import Accordion from '@material-ui/core/Accordion';
import AccordionSummary from '@material-ui/core/AccordionSummary';
import ExpandMoreIcon from '@material-ui/icons/ExpandMore';
import Typography from '@material-ui/core/Typography';
import AccordionDetails from '@material-ui/core/AccordionDetails';
import Grid from '@material-ui/core/Grid'
import Box from '@material-ui/core/Box'
import Container from '@material-ui/core/Container';
import {Context} from "../ContextProvider"
import ResultCard from "../components/ResultCard"
import DetailCard from "../components/DetailCard"
import CircularProgress from '@material-ui/core/CircularProgress'
import Fab from '@material-ui/core/Fab'
import ShareIcon from '@material-ui/icons/Share'
import Dialog from '@material-ui/core/Dialog';
import DialogActions from '@material-ui/core/DialogActions';
import DialogContent from '@material-ui/core/DialogContent';
import DialogContentText from '@material-ui/core/DialogContentText';
import DialogTitle from '@material-ui/core/DialogTitle';
import Slide from '@material-ui/core/Slide';
import Welcome from './welcome';
import {
useHistory,useLocation
} from "react-router-dom";
import PropTypes from 'prop-types';
// import { FiFlag } from 'react-icons/fi';
import FlagOutlinedIcon from '@material-ui/icons/FlagOutlined';
import Tooltip from '@material-ui/core/Tooltip';
function useQuery() {
return new URLSearchParams(useLocation().search);
}
const Transition = React.forwardRef(function Transition(props, ref) {
return
;
});
// ID of condition type that must be excluded in some API requests
const j_only_id = 3
const o_only_id = 1
const useStyles = makeStyles((theme) => ({
root: {
flexGrow: 1,
},
chip: {
margin: 0.5,
},
}));
/**
- * Contain the main logic of OACCT tools to filter and send the appropriate request.
+ * Contains the main logic of the OACT tool: it selects and sends the appropriate request for the chosen filters.
* @version 0.0.1
* @author [Hugo Galuppo](https://github.com/hgpulse)
*/
export default function SearchFilterFields() {
/** Access to URL parameter */
const history = useHistory();
console.log(history)
let query = useQuery()
//state that shows or hides the share url button
const [open, setOpen] = React.useState(false)
const classes = useStyles();
//use the shared context to exchange state between components at different levels
const { getSelectedInstitId,
getSelectedJournalId,
getSelectedFunderId,
institList,
journalList,
funderList,
institId,
journalId,
funderId,
setInstitId,
setJournalId,
setFunderId,
setUrl,
url
}
= useContext(Context)
//responses
const [conditions, setConditions] = useState([]);
const [details, setDetails] = useState([]);
const [result, updateResult] = useState([]);
//Manage the loading state to hide or show the spinner in the search bar
const [loading, setLoading] = useState(false);
// const [url, setUrl] = useState(window.location.href);
useEffect(() => {
setDetails('null')
setUrl(window.location.href)
//handle Url param
console.log(history)
if (history.location.pathname === "/check") {
console.log("this an url to check")
setDetails('fromUrl')
// alert(query.get("institution"))
if (query.get("institution") && !query.get("funder") && !query.get("journal")){
//get organizations conditions
// alert(`get api organization Condition only: ${institId}`)
//condition type is not journal only = 1
const sendSearchInstitOnly =
async () => {
try {
const resp = await searchorganizationonly(query.get("institution"), j_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
console.log(details)
const sendGetrequest =
async () => {
try {
const resp = await getInstitution(query.get("institution"))
// console.log(`instit name from api: ${resp.data.name}`)
// setInstitName(resp.data.name)
updateResult(arr => [...arr, resp.data])
// if (details === "null") {
// setDetails(resp.data)
// }
// else {
// setDetails(prevArray => [...prevArray, resp.data])
// }
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchInstitOnly().then(
sendGetrequest
)
history.push({pathname:`check`, search:`institution=${query.get("institution")}`})
}
else if (!query.get("institution") && !query.get("journal") && query.get("funder")){
//get funder conditions
// alert(`get api funder Condition only: ${funderId}`)
//condition type is not journal only = 1
const sendSearchOrgaOnly =
async () => {
try {
const resp = await searchorganizationonly(query.get("funder"), j_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetrequest =
async () => {
try {
const resp = await getFunder(query.get("funder"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchOrgaOnly().then(
sendGetrequest
)
history.push({pathname:`check`, search:`funder=${query.get("funder")}`})
}
else if (!query.get("funder") && !query.get("institution") && query.get("journal")){
//get journals conditions
// alert(`get api journal Condition only: ${journalId}`)
//condition type is not institution only = 2
//get journal detail
const sendSearchJournalOnly =
async () => {
try {
const resp = await searchjournalonly(query.get("journal"), o_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
// setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetrequest =
async () => {
try {
const resp = await getJournal(query.get("journal"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchJournalOnly().then(
sendGetrequest
)
history.push({pathname:`check`, search:`journal=${query.get("journal")}`})
}
else if (query.get("institution") && query.get("funder") && !query.get("journal")) {
//alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${institId}`)
//condition type journal/condition = 3
const sendSearchCondi =
async () => {
try {
const resp = await searchInstitFunder(query.get("institution"), query.get("funder"), j_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetInstit =
async () => {
try {
const resp = await getInstitution(query.get("institution"))
console.log(resp.data)
// detailArray.push(resp.data)
// setDetails(detailArray)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetFunder =
async () => {
try {
const resp = await getFunder(query.get("funder"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchCondi()
.then(sendGetInstit)
.then(sendGetFunder)
history.push({pathname:`check`, search: `institution=${query.get("institution")}&funder=${query.get("funder")}`})
}
else if (query.get("institution") && query.get("journal") && !query.get("funder")) {
//alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${institId}`)
//condition type journal/condition = 3
const sendSearchCondi =
async () => {
try {
const resp = await searchCondi(query.get("journal"),query.get("institution"))
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetInstit =
async () => {
try {
const resp = await getInstitution(query.get("institution"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetJournal =
async () => {
try {
const resp = await getJournal(query.get("journal"))
console.log(resp.data)
// detailArray.push(resp.data)
// setDetails(detailArray)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchCondi()
.then(sendGetInstit)
.then(sendGetJournal)
history.push({pathname:`check`, search: `institution=${query.get("institution")}&journal=${query.get("journal")}`})
}
else if (!query.get("institution") && query.get("journal") && query.get("funder")) {
// alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${funderId}`)
//condition type journal/institution/funder conditions = 3
const sendGetCondi =
async () => {
try {
const resp = await searchCondi(query.get("journal"),query.get("funder"))
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetFunder =
async () => {
try {
const resp = await getFunder(query.get("funder"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetJournal =
async () => {
try {
const resp = await getJournal(query.get("journal"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendGetCondi()
.then(sendGetFunder)
.then(sendGetJournal)
history.push({pathname:`check`, search: `funder=${query.get("funder")}&journal=${query.get("journal")}`})
}
else if (query.get("institution") && query.get("journal") && query.get("funder")) {
// alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${funderId}`)
//condition type journal/institution/funder conditions = 3
console.log("main check !")
//(institution + journal)
const detailArray = []
const sendGetCondi =
async () => {
try {
const resp = await searchCondi3(query.get("institution"),query.get("journal"),query.get("funder"))
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetInstit =
async () => {
try {
const resp = await getInstitution(query.get("institution"))
console.log(resp.data)
detailArray.push(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetFunder =
async () => {
try {
const resp = await getFunder(query.get("funder"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetJournal =
async () => {
try {
const resp = await getJournal(query.get("journal"))
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
//order requests
sendGetCondi()
sendGetInstit()
.then(sendGetFunder)
.then(sendGetJournal)
history.push({pathname:`check`, search: `institution=${query.get("institution")}&funder=${query.get("funder")}&journal=${query.get("journal")}`})
}
}
}, [])
//useEffect on Url state change
React.useEffect(() => {
//condition to avoid infinite loop
if (history.location.pathname === "/") {
setConditions([])
setDetails('null')
updateResult([])
setUrl(window.location.href)
}
}, [url]);
function handleReport () {
// ## Create mail template to report a modification, contain the actual Url and the reference Term Card
- window.open(`mailto:publishsupport@epfl.ch?subject= OACCT Modification request for ${encodeURIComponent(url)} &body=Request Description:`)
+ window.open(`mailto:publishsupport@epfl.ch?subject=${encodeURIComponent(`OACT Modification request for ${url}`)}&body=${encodeURIComponent("Request Description:")}`)
}
//copy url to clipboard
function handlShare(e) {
setOpen(true)
navigator.clipboard.writeText(url)
}
const handleClose = () => {
setOpen(false);
};
function handleInstit(e, newInputValue) {
if (newInputValue){
getSelectedInstitId(newInputValue)
return
}
// if (institName){
// getSelectedInstitId(institName)
// return
// }
setInstitId("")
}
function handleFunder(e, newInputValue) {
console.log(newInputValue)
if (newInputValue){
getSelectedFunderId(newInputValue)
return
}
setFunderId("")
}
function handleJournal(e, newInputValue) {
if (newInputValue){
getSelectedJournalId(newInputValue)
return
}
setJournalId("")
}
function handleSubmit(e) {
setLoading(true)
e.preventDefault()
//reset previous results
setConditions([])
setDetails([])
updateResult([])
if (!institId && !journalId && !funderId){
setLoading(false)
setDetails('null')
}
if (institId && !journalId && !funderId){
//get organizations conditions
// alert(`get api organization Condition only: ${institId}`)
//condition type is not journal only = 1
const sendSearchInstitOnly =
async () => {
try {
const resp = await searchorganizationonly(institId, j_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
console.log(details)
const sendGetrequest =
async () => {
try {
const resp = await getInstitution(institId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
// if (details === "null") {
// setDetails(resp.data)
// }
// else {
// setDetails(prevArray => [...prevArray, resp.data])
// }
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchInstitOnly().then(
sendGetrequest
)
history.push({pathname:`check`, search:`institution=${institId}`})
}
else if (!institId && !journalId && funderId){
//get funder conditions
// alert(`get api funder Condition only: ${funderId}`)
//condition type is not journal only = 1
const sendSearchOrgaOnly =
async () => {
try {
const resp = await searchorganizationonly(funderId, j_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetrequest =
async () => {
try {
const resp = await getFunder(funderId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchOrgaOnly().then(
sendGetrequest
)
history.push({pathname:`check`, search:`funder=${funderId}`})
}
else if (!funderId && !institId && journalId){
//get journals conditions
// alert(`get api journal Condition only: ${journalId}`)
//condition type is not institution only = 2
//get journal detail
const sendSearchJournalOnly =
async () => {
try {
const resp = await searchjournalonly(journalId, o_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
// setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetrequest =
async () => {
try {
const resp = await getJournal(journalId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchJournalOnly().then(
sendGetrequest
)
history.push({pathname:`check`, search:`journal=${journalId}`})
}
else if (institId && funderId && !journalId) {
//alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${institId}`)
//condition type journal/condition = 3
const sendSearchCondi =
async () => {
try {
const resp = await searchInstitFunder(institId, funderId, j_only_id)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetInstit =
async () => {
try {
const resp = await getInstitution(institId)
console.log(resp.data)
//manage the order output
// detailArray.push(resp.data)
// setDetails(detailArray)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetFunder =
async () => {
try {
const resp = await getFunder(funderId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchCondi()
.then(sendGetInstit)
.then(sendGetFunder)
history.push({pathname:`check`, search: `institution=${institId}&funder=${funderId}`})
}
else if (institId && journalId && !funderId) {
//alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${institId}`)
//condition type journal/condition = 3
const sendSearchCondi =
async () => {
try {
const resp = await searchCondi(journalId,institId)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetInstit =
async () => {
try {
const resp = await getInstitution(institId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetJournal =
async () => {
try {
const resp = await getJournal(journalId)
console.log(resp.data)
// detailArray.push(resp.data)
// setDetails(detailArray)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendSearchCondi()
.then(sendGetInstit)
.then(sendGetJournal)
history.push({pathname:`check`, search: `institution=${institId}&journal=${journalId}`})
}
else if (!institId && journalId && funderId) {
// alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${funderId}`)
//condition type journal/institution/funder conditions = 3
const sendGetCondi =
async () => {
try {
const resp = await searchCondi(journalId,funderId)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetFunder =
async () => {
try {
const resp = await getFunder(funderId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetJournal =
async () => {
try {
const resp = await getJournal(journalId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
sendGetCondi()
.then(sendGetJournal)
.then(sendGetFunder)
history.push({pathname:`check`, search: `funder=${funderId}&journal=${journalId}`})
}
else if (institId && journalId && funderId) {
// alert(`get api Filter Conditions SET--> Journal: ${journalId} VS Institution: ${funderId}`)
//condition type journal/institution/funder conditions = 3
console.log("main check !")
//(institution + journal)
const detailArray = []
const sendGetCondi =
async () => {
try {
const resp = await searchCondi3(institId,journalId,funderId)
console.log(resp.data)
setConditions(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
setLoading(false)
}
const sendGetInstit =
async () => {
try {
const resp = await getInstitution(institId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetFunder =
async () => {
try {
const resp = await getFunder(funderId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
const sendGetJournal =
async () => {
try {
const resp = await getJournal(journalId)
console.log(resp.data)
updateResult(arr => [...arr, resp.data])
} catch (err) {
// Handle Error Here
console.error(err);
}
}
//order the request
sendGetCondi()
sendGetInstit()
.then(sendGetFunder)
.then(sendGetJournal)
history.push({pathname:`check`, search: `institution=${institId}&funder=${funderId}&journal=${journalId}`})
}
}
console.log(`all conditions SET: ${conditions}`)
console.log(details)
console.log(`Selected Institution ID: ${institId} , Selected Funder: ${funderId}, Selected Journal ID: ${journalId}`)
function detailsResult() {
console.log(`details: ${details}`)
console.log(result)
if (details !== 'null') {
return (
}
aria-controls="panel1a-content"
id="panel1a-header"
>
Selected option(s)
{result?.map(i => (
))}
)
}
}
function conditionResults () {
return (
{conditions?.map(i=> (
))}
)
}
return (
{detailsResult()}
{conditionResults()}
{ history.location.pathname === "/" &&
}
{/* { history.location.pathname === "/" && */}
{/* } */}
{"Share your Result!"}
{url}
Copy to clipboard!
);
}
SearchFilterFields.propTypes = {
/** Store the selected option/field Result from API. */
details: PropTypes.object,
/** Store the individual response for each request. */
result: PropTypes.object,
/** Store the aggregated results of all requests in one place. */
conditions: PropTypes.object,
/** Manage the loading wheels inside the check button. */
loading: PropTypes.bool
}
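The component above appends each detail response to `result` as it arrives, so the order of the cards follows the order in which the requests complete. The following is a minimal sketch of running requests strictly one after another, using hypothetical stand-ins (`fakeRequest`, `runInOrder`) rather than the real `getInstitution`, `getFunder` and `getJournal` calls. The point it illustrates: `.then(fn)` defers the call until the previous promise resolves, whereas `.then(fn())` calls `fn` immediately and hands its promise (not a function) to `then`, which ignores it.

```javascript
// Hypothetical stand-in for an API call: resolves with `name` after `delayMs`
// and records the completion order in `log`.
function fakeRequest(name, delayMs, log) {
  return () =>
    new Promise((resolve) => {
      setTimeout(() => {
        log.push(name);
        resolve(name);
      }, delayMs);
    });
}

// Calls each request function only after the previous one has resolved.
async function runInOrder(requests) {
  const results = [];
  for (const request of requests) {
    results.push(await request());
  }
  return results;
}

// Usage: the slowest request is listed first, yet completion follows the list order.
const completionLog = [];
runInOrder([
  fakeRequest("institution", 20, completionLog),
  fakeRequest("funder", 5, completionLog),
  fakeRequest("journal", 1, completionLog),
]).then(() => console.log(completionLog.join(", ")));
```

Sequential execution trades a little latency for a predictable card order. If latency matters more, the requests could instead run in parallel with `Promise.all`, which returns results in input order regardless of completion order.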
diff --git a/assets/src/pages/about.css b/assets/src/pages/about.css
index 3ef60b46..4c41ad7a 100644
--- a/assets/src/pages/about.css
+++ b/assets/src/pages/about.css
@@ -1,12 +1,17 @@
.main {
margin: 3rem !important;
}
+.div {
+ margin: 5rem;
+ text-align: left;
+ }
+
h1, h2 {
margin-bottom: 3rem;
+ text-align: center;
}
-
li {
text-align: left;
}
diff --git a/assets/src/pages/help.css b/assets/src/pages/help.css
index 5421bcbb..853683ed 100644
--- a/assets/src/pages/help.css
+++ b/assets/src/pages/help.css
@@ -1,12 +1,24 @@
-
- .div{
+.div {
margin: 5rem;
+ text-align: left;
}
- .list {
+
+.list {
margin-left: 3rem;
margin-top: 3rem;
}
- .list-center {
+
+.list-center {
text-align: center;
margin-top: 3rem;
- }
\ No newline at end of file
+ }
+
+h1, h2 {
+ margin-bottom: 3rem;
+ text-align: center;
+}
+
+img {
+ float: left;
+ margin-right: 10px;
+}
\ No newline at end of file
diff --git a/assets/src/pages/welcome.css b/assets/src/pages/welcome.css
index 5dbbd5f9..3af7b231 100644
--- a/assets/src/pages/welcome.css
+++ b/assets/src/pages/welcome.css
@@ -1,82 +1,82 @@
.div{
margin-block-end: 5rem;
}
.flex-container{
display: flex;
height: 60rem; /* Or whatever */
flex-direction: column;
}
.flex-item {
/* border-style: solid; */
margin: 1rem;
padding: 2rem;
- cursor: pointer;
+ /* cursor: pointer; */
display: block;
background: whitesmoke;
box-shadow: 0 2px 48px 0 rgba(0, 0, 0, 0.10);
-webkit-border-radius: 20px;
-moz-border-radius: 20px;
border-radius: 20px;
padding: 30px;
text-align: center;
-webkit-transition: all 0.3s ease 0s;
-moz-transition: all 0.3s ease 0s;
-o-transition: all 0.3s ease 0s;
transition: all 0.3s ease 0s;
position: relative;
margin-bottom: 30px;
}
/* .flex-item:hover {
background-color: #3771C8 ;
} */
h2{
font-family: 'Quicksand', sans-serif;
}
p {
text-align: left;
font-family: 'Quicksand', sans-serif;
}
/* IPAD Portrait */
@media only screen
and (min-device-width: 768px)
and (max-device-width: 1024px)
and (orientation: portrait)
and (-webkit-min-device-pixel-ratio: 1) {
.flex-container{
padding-top: 3rem;
margin: 1rem;
display: flex; /* or inline-flex */
height: 40rem; /* Or whatever */
width: 42rem;
flex-direction: column;
align-items: stretch;
justify-content: space-around;
}
.flex-item {
margin: 1rem;
padding: 2rem;
}
}
/* Desktop */
@media only screen and (min-width: 1024px) {
.flex-container{
height: 25rem; /* Or whatever */
display: flex; /* or inline-flex */
flex-direction: row;
align-items: stretch;
}
.flex-item {
width: 45rem;
margin: 2rem;
padding: 2rem;
}
}
diff --git a/assets/src/pages/welcome.js b/assets/src/pages/welcome.js
index 109174bd..78eb8fe0 100644
--- a/assets/src/pages/welcome.js
+++ b/assets/src/pages/welcome.js
@@ -1,37 +1,54 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import React from "react"
import "./welcome.css"
import Button from '@material-ui/core/Button';
export default function Welcome () {
return (
Welcome!
-
The OACCT (Open Access Compliance Check Tool) is an online resource, tailored to the Swiss academic community's needs, that gathers the most important information concerning Open-Access publishing.
+
The OACT (Open Access Check Tool) is an online resource, tailored to the Swiss academic community's needs, that gathers the most important information concerning Open-Access publishing.
Mission
Its principal goal is to guide Swiss researchers in deciding where and how to publish their works in compliance with funders’ and institutional Open Access policies.
Where do our data come from?
-
OACCT provides a list of journals with information aggregated from several sources on a regular basis:
+
OACT provides a list of journals with information aggregated from several sources on a regular basis:
Journal ISSNs (source: ISSN International centre)
Publication conditions (source: Sherpa/Romeo)
Swiss institutions from swissuniversities
)
}
diff --git a/assets/src/reportWebVitals.js b/assets/src/reportWebVitals.js
index 5253d3ad..3e1b38f4 100644
--- a/assets/src/reportWebVitals.js
+++ b/assets/src/reportWebVitals.js
@@ -1,13 +1,30 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
const reportWebVitals = onPerfEntry => {
if (onPerfEntry && onPerfEntry instanceof Function) {
import('web-vitals').then(({ getCLS, getFID, getFCP, getLCP, getTTFB }) => {
getCLS(onPerfEntry);
getFID(onPerfEntry);
getFCP(onPerfEntry);
getLCP(onPerfEntry);
getTTFB(onPerfEntry);
});
}
};
export default reportWebVitals;
diff --git a/assets/src/services/Api.js b/assets/src/services/Api.js
index b0e7d193..bf2fbde3 100644
--- a/assets/src/services/Api.js
+++ b/assets/src/services/Api.js
@@ -1,19 +1,36 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import axios from 'axios'
const Api = axios.create({
baseURL: `/api/`,
})
export default Api
// How to manage the different addresses (dev, prod)?
//docker-compose up url http://0.0.0.0:8000/api/
//local: http://127.0.0.1:8000/api/
//Dev: https://oacct-dev.epfl.ch/api/
//Test: https://oacct-test.epfl.ch/api/
//Dev: https://oacct-dev.epfl.ch/api/
diff --git a/assets/src/services/requests/Condition.js b/assets/src/services/requests/Condition.js
index b2ff9d39..06b65857 100644
--- a/assets/src/services/requests/Condition.js
+++ b/assets/src/services/requests/Condition.js
@@ -1,56 +1,73 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import Api from '../Api'
let today = new Date().toISOString().slice(0, 10)
var date_filter = 'ge(journalcondition.valid_until,' + today +
'),le(journalcondition.valid_from,' + today +
'),ge(organizationcondition.valid_until,' + today +
'),le(organizationcondition.valid_from,' + today + ')'
// To stop filtering by validity dates, replace by a trivial filter such as
// var date_filter = 'ge(condition.id,0)'
export const getCondition = (id) => {
return Api.request({
url: `/conditionterm/${id}`,
method: 'GET',
})
}
export const getListOfCondition = () => {
return Api.request({
url: `/conditionterm/`,
method: 'GET',
})
}
export const searchCondi = (journalId,institId) => {
return Api.request({
url: `/conditionset_light/?and(eq(journalcondition.journal.id,${journalId}),eq(organizationcondition.organization.id,${institId}),${date_filter})`,
method: 'GET',
})
}
export const searchInstitFunder = (institId,funderId,condi) => {
return Api.request({
url: `/conditionset_light/?(eq(organizationcondition.organization.id,${institId})|eq(organizationcondition.organization.id,${funderId})),ne(condition_type.id,${condi}),and(${date_filter})`,
method: 'GET',
})
}
export const searchCondi3 = (institId,journalId,funderId) => {
return Api.request({
url: `/conditionset_light/?(eq(organizationcondition.organization.id,${institId})|eq(organizationcondition.organization.id,${funderId})),eq(journalcondition.journal.id,${journalId}),and(${date_filter})`,
method: 'GET',
})
}
export const searchorganizationonly = (id,condi) => {
return Api.request({
url: `/conditionset_light/?and(eq(organizationcondition.organization.id,${id}),ne(condition_type.id,${condi}),${date_filter})`,
method: 'GET',
})
}
export const searchjournalonly = (id,condi) => {
return Api.request({
url: `/conditionset_light/?and(eq(journalcondition.journal.id,${id}),ne(condition_type.id,${condi}),${date_filter})`,
method: 'GET',
})
}
\ No newline at end of file
diff --git a/assets/src/services/requests/Funder.js b/assets/src/services/requests/Funder.js
index 4b1ceaae..260bd8a2 100644
--- a/assets/src/services/requests/Funder.js
+++ b/assets/src/services/requests/Funder.js
@@ -1,21 +1,38 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import Api from '../Api'
export const getFunder = (id) => {
return Api.request({
url: `/funder/${id}`,
method: 'GET',
})
}
export const getListOfFunder = () => {
return Api.request({
url: `/funder/`,
method: 'GET',
})
}
export const searchFunderCondi = (id,condi) => {
return Api.request({
url: `/organizationcondition/?and(eq(organization.id,${id}),ne(condition_set.condition_type.id,${condi}))`,
method: 'GET',
})
}
\ No newline at end of file
diff --git a/assets/src/services/requests/Institution.js b/assets/src/services/requests/Institution.js
index ec2275e3..845c1ec6 100644
--- a/assets/src/services/requests/Institution.js
+++ b/assets/src/services/requests/Institution.js
@@ -1,35 +1,52 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import Api from '../Api'
export const getInstitution = (id) => {
return Api.request({
url: `/organization/${id}`,
method: 'GET',
})
}
export const getListOfInstitution = () => {
return Api.request({
url: `/organization/`,
method: 'GET',
})
}
export const searchListOfInstitutionCondi = (id,condi) => {
return Api.request({
url: `/organizationcondition/?and(eq(organization.id,${id}),ne(condition_set.condition_type.id,${condi}))`,
method: 'GET',
})
}
export const getListOfCondiInstitution = () => {
return Api.request({
url: `/organizationcondition/`,
method: 'GET',
})
}
export const getInstitutionCondi = (id) => {
return Api.request({
url: `/organizationcondition/${id}`,
method: 'GET',
})
}
\ No newline at end of file
diff --git a/assets/src/services/requests/Journal.js b/assets/src/services/requests/Journal.js
index c5821772..a618e138 100644
--- a/assets/src/services/requests/Journal.js
+++ b/assets/src/services/requests/Journal.js
@@ -1,24 +1,41 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
import Api from '../Api'
export const getJournal = (id) => {
return Api.request({
url: `/journal/${id}`,
method: 'GET',
})
}
export const searchListOfJournalCondi = (id,condi) => {
return Api.request({
url: `/journalcondition/?and(eq(journal.id,${id}),ne(condition_set.condition_type.id,${condi}))`,
method: 'GET',
})
}
export const getListOfJournal = () => {
return Api.request({
url: `/journal_light/`,
method: 'GET',
})
}
\ No newline at end of file
diff --git a/assets/src/services/requests/requests.md b/assets/src/services/requests/requests.md
index 55406f0b..7a099a64 100644
--- a/assets/src/services/requests/requests.md
+++ b/assets/src/services/requests/requests.md
@@ -1,24 +1,24 @@
## Requests React location
`src/services/requests`
## Rql
RQL (Resource Query Language) is designed for modern application development. It is built for the web, ready for NoSQL, and highly extensible with a simple syntax. It is a query language for fast and convenient database interaction, designed for use in URLs to request object-style data structures.
source: [django-rql](https://django-rql.readthedocs.io/)
-# Why Rql for OACCT?
+# Why Rql for OACT?
-The OACCT's data structure has a complicated design to allow different data management use cases (add, update, delete)via API/Backend Admin/frontend.
+The OACCT's data structure is designed to allow different data management use cases (add, update, delete) via API/Backend Admin/frontend. A flexible API language such as Rql can fully support these use cases.
Rql allows us to make different requests with the filters included in the URL:
Example inside Condition.js:
`/conditionset/?and(eq(journalcondition.journal.id,${id}),ne(condition_type.id,${condi}),${date_filter})`
Rql language is fully integrated into Django Rest Framework.
It allows us to test a request manually by entering it directly as a URL, without changing the models or views.
Example on the dev URL:
`https://oacct-dev.epfl.ch/api/conditionset/?and(eq(journalcondition.journal.id,3),eq(organizationcondition.organization.id,11),eq(condition_type.id,1))`
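The filter strings above can also be assembled from small helper functions instead of one long template literal. The following is only a minimal sketch: the helper names (`rqlTerm`, `rqlAnd`, `validityTerms`, `conditionSetUrl`) are hypothetical and do not exist in `src/services/requests`; the field names and the `/conditionset_light/` endpoint are taken from Condition.js.

```javascript
// Hypothetical helpers for illustration only, not part of the code base.

// Build a single RQL comparison, e.g. eq(journalcondition.journal.id,3).
function rqlTerm(operator, field, value) {
  return `${operator}(${field},${value})`
}

// Combine several terms into one and(...) expression.
function rqlAnd(...terms) {
  return `and(${terms.join(',')})`
}

// Validity filter in the spirit of date_filter in Condition.js:
// a condition is kept when valid_from <= day <= valid_until.
function validityTerms(day) {
  return ['journalcondition', 'organizationcondition'].flatMap((prefix) => [
    rqlTerm('ge', `${prefix}.valid_until`, day),
    rqlTerm('le', `${prefix}.valid_from`, day),
  ])
}

// Assemble the relative URL for a journal + institution search.
function conditionSetUrl(journalId, institId, day) {
  const query = rqlAnd(
    rqlTerm('eq', 'journalcondition.journal.id', journalId),
    rqlTerm('eq', 'organizationcondition.organization.id', institId),
    ...validityTerms(day)
  )
  return `/conditionset_light/?${query}`
}

// Same date format as in Condition.js (YYYY-MM-DD).
const today = new Date().toISOString().slice(0, 10)
console.log(conditionSetUrl(3, 11, today))
```

The resulting string could be passed as `url` to `Api.request`, in the same way as the template literals in Condition.js. Keeping each comparison in its own function call makes it easier to spot a missing parenthesis or comma than in a single hand-written string.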
diff --git a/assets/src/setupTests.js b/assets/src/setupTests.js
index 8f2609b7..ad2ab820 100644
--- a/assets/src/setupTests.js
+++ b/assets/src/setupTests.js
@@ -1,5 +1,22 @@
+/*
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+*/
+
// jest-dom adds custom jest matchers for asserting on DOM nodes.
// allows you to do things like:
// expect(element).toHaveTextContent(/react/i)
// learn more: https://github.com/testing-library/jest-dom
import '@testing-library/jest-dom';
diff --git a/conf/nginx-app.conf b/conf/nginx-app.conf
index b9b4aa77..b5b0af42 100644
--- a/conf/nginx-app.conf
+++ b/conf/nginx-app.conf
@@ -1,184 +1,186 @@
# nginx-app.conf
# Enable CORS for selected origins
# map instead of many if's
map $http_origin $cors {
default "null";
"https://www.test-cors.org" $http_origin;
"https://www.epfl.ch" $http_origin;
"http://127.0.0.1" $http_origin;
"https://localhost" $http_origin;
}
# the upstream component nginx needs to connect to
upstream django {
server unix:/oacct_checker/app.sock; # for a file socket
# server 127.0.0.1:8001; # for a web port socket (we'll use this first)
}
# We want to see the original IP of HTTP requests, not the one from the Openshift gateway
-set_real_ip_from 172.31.0.0/16;
-set_real_ip_from 10.180.21.0/24;
-set_real_ip_from 127.0.0.1/8;
+#set_real_ip_from 172.31.0.0/16;
+#set_real_ip_from 10.180.21.0/24;
+#set_real_ip_from 127.0.0.1/8;
# log format as per C2C recommendation 2022-03-28
-log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"';
+log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" -- "$http_x_forwarded_for" -- "$http_x_real_ip"';
+access_log /dev/stdout main;
+error_log /dev/stderr info;
+
server {
listen 8080 default_server;
#listen [::]:80 ;
server_name 127.0.0.1;
## Redirect HTTP to HTTPS ##
#return 301 https://$server_name$request_uri;
# default max body size of 1M not sufficient for 1000 journals
client_max_body_size 100M;
add_header "Content-Security-Policy" "default-src 'self' https://web2018.epfl.ch https://cdn.datatables.net";
add_header "Strict-Transport-Security" "max-age=31536000";
-
# Django media; not needed in this project
location /media {
alias /oacct_checker; # your Django project's media files - amend as required
}
location /static {
alias /oacct_checker/staticfiles; # your Django project's static files - amend as required
# Simple requests
if ($request_method ~* "(GET|POST)") {
add_header "Access-Control-Allow-Origin" "$cors";
}
# Preflighted requests
if ($request_method = OPTIONS ) {
add_header "Access-Control-Allow-Origin" "$cors";
add_header "Access-Control-Allow-Methods" "GET, POST, OPTIONS, HEAD";
add_header "Access-Control-Allow-Headers" "Authorization, Origin, X-Requested-With, Content-Type, Accept";
return 200;
}
}
location /sphinx {
alias /oacct_checker/sphinx/_build/html; # Sphinx documentation served separately
# Simple requests (standard, probably overkill)
if ($request_method ~* "(GET|POST)") {
add_header "Access-Control-Allow-Origin" "$cors";
}
# Preflighted requests
if ($request_method = OPTIONS ) {
add_header "Access-Control-Allow-Origin" "$cors";
add_header "Access-Control-Allow-Methods" "GET, POST, OPTIONS, HEAD";
add_header "Access-Control-Allow-Headers" "Authorization, Origin, X-Requested-With, Content-Type, Accept";
return 200;
}
}
location /styleguide {
alias /oacct_checker/reactDoc/styleguide; # React styleguide documentation served separately
# Simple requests
if ($request_method ~* "(GET|POST)") {
add_header "Access-Control-Allow-Origin" "$cors";
}
# Preflighted requests
if ($request_method = OPTIONS ) {
add_header "Access-Control-Allow-Origin" "$cors";
add_header "Access-Control-Allow-Methods" "GET, POST, OPTIONS, HEAD";
add_header "Access-Control-Allow-Headers" "Authorization, Origin, X-Requested-With, Content-Type, Accept";
return 200;
}
}
# Finally, send all non-media requests to the Django server.
location / {
uwsgi_pass django;
include /oacct_checker/conf/uwsgi_params; # the uwsgi_params file you installed
# Simple requests
if ($request_method ~* "(GET|POST)") {
add_header "Access-Control-Allow-Origin" "$cors";
}
# Preflighted requests
if ($request_method = OPTIONS ) {
add_header "Access-Control-Allow-Origin" "$cors";
add_header "Access-Control-Allow-Methods" "GET, POST, OPTIONS, HEAD";
add_header "Access-Control-Allow-Headers" "Authorization, Origin, X-Requested-With, Content-Type, Accept";
return 200;
}
}
}
# configuration of the server
server {
# the port your site will be served on, default_server indicates that this server block
# is the block to use if no blocks match the server_name
# SSL configuration
listen 4443 ssl http2 default_server;
listen [::]:4443 ssl http2 ;
include snippets/self-signed.conf;
include snippets/ssl-params.conf;
# the domain name it will serve for
server_name 127.0.0.1; # substitute your machine's IP address or FQDN
charset utf-8;
# max upload size
client_max_body_size 75M; # adjust to taste
add_header "Content-Security-Policy" "default-src 'self' https://web2018.epfl.ch https://cdn.datatables.net";
add_header "Strict-Transport-Security" "max-age=31536000";
# Django media
location /media {
alias /oacct_checker; # your Django project's media files - amend as required
}
location /static {
alias /oacct_checker/staticfiles; # your Django project's static files - amend as required
# Simple requests
if ($request_method ~* "(GET|POST)") {
add_header "Access-Control-Allow-Origin" "$cors";
}
# Preflighted requests
if ($request_method = OPTIONS ) {
add_header "Access-Control-Allow-Origin" "$cors";
add_header "Access-Control-Allow-Methods" "GET, POST, OPTIONS, HEAD";
add_header "Access-Control-Allow-Headers" "Authorization, Origin, X-Requested-With, Content-Type, Accept";
return 200;
}
}
# Finally, send all non-media requests to the Django server.
location / {
uwsgi_pass django;
include /oacct_checker/conf/uwsgi_params; # the uwsgi_params file you installed
# default timeout of 60s too short for significant JSON uploads?
uwsgi_read_timeout 300s;
uwsgi_send_timeout 300s;
# Simple requests
if ($request_method ~* "(GET|POST)") {
add_header "Access-Control-Allow-Origin" "$cors";
}
# Preflighted requests
if ($request_method = OPTIONS ) {
add_header "Access-Control-Allow-Origin" "$cors";
add_header "Access-Control-Allow-Methods" "GET, POST, OPTIONS, HEAD";
add_header "Access-Control-Allow-Headers" "Authorization, Origin, X-Requested-With, Content-Type, Accept";
return 200;
}
}
}
diff --git a/conf/supervisor-app.conf b/conf/supervisor-app.conf
index 44c5bd95..ff989ff2 100644
--- a/conf/supervisor-app.conf
+++ b/conf/supervisor-app.conf
@@ -1,10 +1,13 @@
[program:app-uwsgi]
command = /usr/local/bin/uwsgi --ini /oacct_checker/uwsgi.ini
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stdout
stderr_logfile_maxbytes=0
[program:nginx-app]
command = /usr/sbin/nginx
-
+stdout_logfile=/dev/stdout
+stdout_logfile_maxbytes=0
+stderr_logfile=/dev/stdout
+stderr_logfile_maxbytes=0
diff --git a/conf/uwsgi_params b/conf/uwsgi_params
index 52c6a4da..2c16a192 100644
--- a/conf/uwsgi_params
+++ b/conf/uwsgi_params
@@ -1,18 +1,18 @@
uwsgi_param QUERY_STRING $query_string;
uwsgi_param REQUEST_METHOD $request_method;
uwsgi_param CONTENT_TYPE $content_type;
uwsgi_param CONTENT_LENGTH $content_length;
uwsgi_param REQUEST_URI $request_uri;
uwsgi_param PATH_INFO $document_uri;
uwsgi_param DOCUMENT_ROOT $document_root;
uwsgi_param SERVER_PROTOCOL $server_protocol;
uwsgi_param HTTPS $https if_not_empty;
-uwsgi_param X-Real-IP $remote_addr;
+uwsgi_param X-Real-IP $http_x_forwarded_for;
uwsgi_param REMOTE_ADDR $remote_addr;
uwsgi_param REMOTE_PORT $remote_port;
uwsgi_param SERVER_PORT $server_port;
uwsgi_param SERVER_NAME $server_name;
diff --git a/django_api/__init__.py b/django_api/__init__.py
index e69de29b..dfa285ae 100644
--- a/django_api/__init__.py
+++ b/django_api/__init__.py
@@ -0,0 +1,17 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
diff --git a/django_api/admin.py b/django_api/admin.py
index 96cc95cb..fb561423 100644
--- a/django_api/admin.py
+++ b/django_api/admin.py
@@ -1,506 +1,621 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
""" Django admin module for the django_api application
All admin pages inherit from import_export.admin.ImportExportModelAdmin for JSON import/export.
"""
from django.contrib import admin
from django import forms
from datetime import date, datetime
import re
from import_export.admin import ImportExportModelAdmin
from django.contrib.admin import TabularInline
from django.contrib.admin import SimpleListFilter
from django.contrib.admin import RelatedOnlyFieldListFilter
+from django.db.models import Prefetch
from django.forms.models import BaseInlineFormSet
from django.utils.translation import gettext_lazy as _
#from inline_actions.admin import InlineActionsMixin
#from inline_actions.admin import InlineActionsModelAdminMixin
from django.shortcuts import render
from django.http import HttpResponseRedirect
from django.urls import reverse
from django.utils.html import escape, mark_safe, format_html
from .models import Country
from .models import Language
from .models import Issn
from .models import Oa
from .models import Publisher
from .models import Journal
from .models import Organization
from .models import Version
from .models import Licence
from .models import Cost_factor_type
from .models import Cost_factor
from .models import Term
from .models import ConditionType
from .models import ConditionSubType
from .models import ConditionSet
from .models import OrganizationCondition
from .models import JournalCondition
# Register your models here.
@admin.register(Issn)
class IssnAdmin(ImportExportModelAdmin):
# TODO use RelatedOnlyFieldListFilter for publisher when data allows it
list_filter = ('issn_type', 'journal__publisher__name', )
list_display = ("id", "issn", 'journal')
class IssnInline(admin.TabularInline):
model = Issn
readonly_fields = ('issn', 'issn_type',)
# This Inline is strictly read-only for the moment
def has_change_permission(self, request, obj=None):
return False
def has_add_permission(self, request, obj=None):
return False
def has_delete_permission(self, request, obj=None):
return False
@admin.register(Journal)
class JournalAdmin(ImportExportModelAdmin):
list_display = ("id", "name", "get_journal_issns",)
# TODO use RelatedOnlyFieldListFilter for publisher when data allows it
list_filter = ('oa_status', 'publisher', )
filter_horizontal = ('publisher', 'language', )
search_fields = ('name', 'classIssn__issn')
inlines = (IssnInline, )
@admin.display(description='ISSNs')
def get_journal_issns(self, obj):
return list(Issn.objects.filter(journal=obj))
@admin.register(Language)
class LanguageAdmin(ImportExportModelAdmin):
pass
@admin.register(Organization)
class OrganizationAdmin(ImportExportModelAdmin):
list_display = ("id", "name")
list_filter = ('is_funder', ('country', RelatedOnlyFieldListFilter))
filter_horizontal = ('country', )
search_fields = ('name', )
@admin.register(Version)
class VersionAdmin(ImportExportModelAdmin):
pass
@admin.register(Country)
class CountryAdmin(ImportExportModelAdmin):
pass
@admin.register(Oa)
class OaAdmin(ImportExportModelAdmin):
pass
@admin.register(Publisher)
class PublisherAdmin(ImportExportModelAdmin):
list_display = ("id", "name")
# Experimental: what will happen with 200+ countries in the database?
list_filter = (('country', RelatedOnlyFieldListFilter), )
filter_horizontal = ('country', )
@admin.register(Term)
class TermAdmin(ImportExportModelAdmin):
list_display = ("id", "__str__", )
list_filter = ('version', 'licence', 'ir_archiving')
search_fields = ("id", "comment", "embargo_months")
filter_horizontal = ('version', 'cost_factor', 'licence', )
# textarea input is better for comments
def get_form(self, request, obj=None, **kwargs):
kwargs['widgets'] = {'comment': forms.Textarea}
return super().get_form(request, obj, **kwargs)
@admin.register(ConditionType)
class ConditionTypeAdmin(ImportExportModelAdmin):
list_display = ("id", "condition_issuer")
@admin.register(ConditionSubType)
class ConditionSubTypeAdmin(ImportExportModelAdmin):
list_display = ("id", "label")
class JournalConditionFormset(forms.BaseInlineFormSet):
def __init__(self, *args, **kwargs):
super(JournalConditionFormset, self).__init__(*args, **kwargs)
- self.queryset = self.queryset.select_related("journal", 'condition_set')
+ self.queryset = self.queryset.select_related('journal', 'condition_set', 'condition_set__condition_type')
+ #self.queryset = self.queryset.prefetch_related('journal')
+ #print('JournalConditionFormset.queryset length ', len(self.queryset))
+ # print(self.queryset.prefetch_related('journal').__dict__)
+
+
+class JournalConditionInlineForm(forms.ModelForm):
+ class Meta:
+ model = JournalCondition
+ exclude = ()
+
+ def __init__(self, *args, **kwargs):
+ super(JournalConditionInlineForm, self).__init__(*args, **kwargs)
+ #print('JournalConditionInlineForm created')
+ #self.fields['journal'].queryset = JournalCondition.objects.select_related('journal').all()
+
#class JournalConditionInline(InlineActionsMixin, TabularInline):
class JournalConditionInline(TabularInline):
model = JournalCondition
fields = ('journal', 'valid_from', 'valid_until')
+ # ordering = ('journal__name', 'valid_from', 'valid_until')
extra = 1
#inline_actions = ['connect_all_journals']
autocomplete_fields = ('journal', )
- formset = JournalConditionFormset
+ fk_name = 'condition_set'
+ formset = JournalConditionFormset
+ #form = JournalConditionInlineForm
+
+ # IN PROGRESS 2022-04-20 we'll get back to it later
+ #form = JournalConditionInlineForm
+
+ def get_queryset(self, *args, **kwargs):
+ qs = super().get_queryset(*args, **kwargs).select_related('journal')
+ print('JournalConditionInline.get_queryset() called')
+ print(qs.__dict__)
+ print(qs.all())
+ return qs
+
+ def dummy_get_formset(self, request, obj=None, **kwargs):
+ formset = super(JournalConditionInline, self).get_formset(request, obj, **kwargs)
+ queryset = formset.form.base_fields["journal"].queryset
+ formset.form.base_fields["journal"].queryset = queryset
+ return formset
+
+ # not working AB 2022-04-27
+ """
+ def formfield_for_foreignkey(self, db_field, request, **kwargs):
+ if 'queryset' in kwargs:
+ kwargs['queryset'] = kwargs['queryset'].select_related()
+ else:
+ db = kwargs.pop('using', None)
+ kwargs['queryset'] = db_field.remote_field.to._default_manager.using(db).complex_filter(db_field.remote_field.limit_choices_to).select_related()
+ return super(JournalConditionInline, self).formfield_for_foreignkey(db_field, request, **kwargs)
+ """
"""
def connect_all_journals(self, request, obj, parent_obj=None):
# Do stuff here, then return None to go to current view
return None
connect_all_journals.short_description = ("Connect Condition Set with some or all Journals")
"""
- def get_queryset(self, request):
- qs = super(JournalConditionInline, self).get_queryset(request).prefetch_related()
- return qs.select_related('journal')
+ #def get_queryset(self, request):
+ # qs = super(JournalConditionInline, self).get_queryset(request).prefetch_related('journal', 'condition_set')
+ # return qs.select_related('journal', 'condition_set', 'condition_set__condition_type')
+
+class SimpleJournalConditionInline(TabularInline):
+ model = JournalCondition
+ fields = ('journal', 'valid_from', 'valid_until')
+ # ordering = ('journal__name', 'valid_from', 'valid_until')
+ extra = 1
+ #autocomplete_fields = ('journal', )
+ fk_name = 'condition_set'
+
+ def get_queryset(self, *args, **kwargs):
+ qs = super().get_queryset(*args, **kwargs).select_related('journal')
+ #print('JournalConditionInline.get_queryset() called')
+ #print(qs.__dict__)
+ #print(qs.all())
+ return qs
+
class OrganizationConditionFormset(forms.BaseInlineFormSet):
def __init__(self, *args, **kwargs):
super(OrganizationConditionFormset, self).__init__(*args, **kwargs)
self.queryset = self.queryset.select_related("organization", 'condition_set')
class OrganizationConditionInline(TabularInline):
#class OrganizationConditionInline(InlineActionsMixin, TabularInline):
# model = OrganizationCondition
model = ConditionSet.organization.through
extra = 1
autocomplete_fields = ('organization', )
- formset = OrganizationConditionFormset
+ formset = OrganizationConditionFormset
+
+ def get_queryset(self, request):
+ qs = super(OrganizationConditionInline, self).get_queryset(request).prefetch_related('organization', 'condition_set')
+ return qs.select_related('organization', 'condition_set', 'condition_set__condition_type')
@admin.action(description='Apply selected condition sets to multiple Journals')
def connect_with_all_journals(modeladmin, request, queryset):
""" Action applicable to one or more ConditionSets: connect with a list of Journals
by entering the ISSN for the relevant ones, or all Journals if no ISSN is given.
Start and end dates must be provided during the action.
This action is useful to connect a new organization policy or publishing
agreement with the journals to which it applies.
"""
if request.POST.get('apply'):
try:
valid_from = date.fromisoformat(request.POST['valid_from'])
valid_until = date.fromisoformat(request.POST['valid_until'])
issn_list = set([x for x in re.split(' |,|;|\n|\r|\t', request.POST['issn_list']) if len(x) > 0])
print(issn_list)
if valid_from > valid_until:
raise ValueError
# print((valid_from, valid_until))
if len(issn_list) == 0:
all_journals = Journal.objects.all()
else:
journal_ids = list(Issn.objects.filter(issn__in=issn_list).values_list('journal', flat=True).distinct())
# print(journal_ids)
all_journals = Journal.objects.filter(id__in=journal_ids)
print(all_journals)
print(len(issn_list), len(all_journals))
# all_journals =[]
# The following block could certainly be optimized! AB 2021-08-12
for condition_set in queryset:
# print('-----------------')
# print(condition_set)
for j in all_journals:
# print(j)
# search for existing connections
existing_connections = JournalCondition.objects.filter(journal=j,
condition_set=condition_set,
valid_from__lt=date.today(),
valid_until__gt=date.today())
# print(existing_connections)
if len(existing_connections) == 0:
new_journal_condition = JournalCondition(journal=j,
condition_set=condition_set,
valid_from=valid_from,
valid_until=valid_until)
new_journal_condition.save()
else:
# This should not happen, or could it?
print(f'{j} already connected with {condition_set}')
return None
except ValueError:
pass
return render(request, 'admin/get_validity_dates.html', context={'queryset': queryset, 'objects': 'journals'})
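The ISSN handling in `connect_with_all_journals` above can be tried in isolation. The following is a minimal sketch, not the project's code: the helper name `parse_issn_list` is hypothetical, and it only mirrors the separator set used in the action (space, comma, semicolon, newline, carriage return, tab) and the removal of empty tokens.

```python
import re

# Same separators as in the admin action: space, comma, semicolon,
# newline, carriage return and tab.
ISSN_SEPARATORS = re.compile(r"[ ,;\n\r\t]")


def parse_issn_list(raw: str) -> set[str]:
    """Split a free-text form field into a set of non-empty ISSN strings.

    Hypothetical helper for illustration; it performs no validation
    of the ISSN format itself.
    """
    return {token for token in ISSN_SEPARATORS.split(raw) if token}


if __name__ == "__main__":
    # Duplicates collapse because the result is a set.
    print(sorted(parse_issn_list("1234-5678, 2345-6789;\r\n1234-5678\t")))
```

An empty result corresponds to the branch of the action that falls back to `Journal.objects.all()`.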
+@admin.action(description='Unlink selected condition sets from all Journals')
+def disconnect_from_all_journals(modeladmin, request, queryset):
+ """ Action applicable to one or more ConditionSets:
+ disconnect from all journals
+ This action is useful in cases where it is easier to unlink the ConditionSet,
+ modify it and re-link later than to work through the ConditionSet admin page
+ (for example for org. policies applicable by default to all possible journals)
+ """
+ warning = 'Are you sure you want to unlink all journals? Only the relationship is affected, '
+ warning += 'no journal data will be destroyed - but hey, think about it first.'
+ if request.POST.get('apply'):
+ try:
+ for condition_set in queryset:
+ # print('-----------------')
+ # print(condition_set)
+ condition_set.journal.clear()
+ condition_set.save()
+ return None
+ except ValueError:
+ pass
+ return render(request, 'admin/are_you_sure.html', context={'queryset': queryset, 'text': warning, 'function': 'disconnect_from_all_journals'})
+
+
@admin.action(description='Apply selected condition sets to all Organizations')
def connect_with_all_organizations(modeladmin, request, queryset):
""" Action applicable to one or more ConditionSets: connect with all Organizations.
Start and end dates must be provided during the action.
This action is useful to connect a new journal policy with all known organizations.
"""
if request.POST.get('apply'):
try:
valid_from = date.fromisoformat(request.POST['valid_from'])
valid_until = date.fromisoformat(request.POST['valid_until'])
if valid_from > valid_until:
raise ValueError
# print((valid_from, valid_until))
all_orgs = Organization.objects.all()
for condition_set in queryset:
# print('-----------------')
# print(condition_set)
for o in all_orgs:
# print(o)
# search for existing connections
existing_connections = OrganizationCondition.objects.filter(organization=o,
condition_set=condition_set,
valid_from__lt=date.today(),
valid_until__gt=date.today())
# print(existing_connections)
if len(existing_connections) == 0:
new_organization_condition = OrganizationCondition(organization=o,
condition_set=condition_set,
valid_from=valid_from,
valid_until=valid_until)
new_organization_condition.save()
return None
except ValueError:
pass
return render(request, 'admin/get_validity_dates.html', context={'queryset': queryset, 'objects': 'organizations'})
@admin.action(description='Set valid_until date')
def end_validity(modeladmin, request, queryset):
""" Action to set the end date for selected Journal-Condition relationships.
This action was introduced to add validity dates to batch-uploaded JournalConditions that lacked this information.
"""
if request.POST.get('apply'):
try:
valid_until = date.fromisoformat(request.POST['date'])
queryset.update(valid_until=valid_until)
return None
except ValueError:
pass
return render(request, 'admin/get_single_validity_date.html',
context={'queryset': queryset.prefetch_related(), 'limit': 'end', 'objects': 'selected journal-condition connections'})
@admin.action(description='Set valid_from date')
def start_validity(modeladmin, request, queryset):
""" Action to set the start date for selected Journal-Condition relationships.
This action was introduced to add validity dates to batch-uploaded JournalConditions that lacked this information.
"""
if request.POST.get('apply'):
try:
valid_from = date.fromisoformat(request.POST['date'])
queryset.update(valid_from=valid_from)
return None
except ValueError:
pass
return render(request, 'admin/get_single_validity_date.html',
context={'queryset': queryset, 'limit': 'start', 'objects': 'selected journal-condition connections'})
class ConditionSetAdminForm(forms.ModelForm):
class Meta:
model = ConditionSet
- fields = ['condition_type', 'subtype', 'term', 'source', 'comment', ]
+ fields = ['condition_type', 'subtype', 'term', 'source', 'comment', 'organization', 'journal']
def __init__(self, *args, **kwargs):
#start = datetime.now()
super(ConditionSetAdminForm, self).__init__(*args, **kwargs)
+ #print(self.__dict__)
self.fields['term'].queryset = Term.objects.all().prefetch_related('licence', 'cost_factor', 'version')
+
#print('ConditionSetAdminForm.__init__(): ', datetime.now(), datetime.now()-start)
@admin.register(ConditionSet)
class ConditionSetAdmin(ImportExportModelAdmin):
# class ConditionSetAdmin(InlineActionsModelAdminMixin, ImportExportModelAdmin):
list_display = ("id", "condition_type", "comment")
search_fields = ['organization__name', 'journal__name', 'comment', 'id', 'condition_type__condition_issuer']
list_filter = ('condition_type', 'journal__publisher__name', 'organization__name', )
form = ConditionSetAdminForm
filter_horizontal = ('term', )
- exclude = ('organization', 'journal', )
+ #exclude = ('organization', 'journal', )
inlines = (OrganizationConditionInline, JournalConditionInline, )
- # inlines = (OrganizationConditionInline, )
- actions = [connect_with_all_journals, connect_with_all_organizations]
+ #inlines = (OrganizationConditionInline, )
+ actions = [connect_with_all_journals, connect_with_all_organizations, disconnect_from_all_journals]
# textarea input is better for comments
def get_form(self, request, obj=None, **kwargs):
kwargs['widgets'] = {'comment': forms.Textarea}
return super().get_form(request, obj, **kwargs)
def get_queryset(self, request):
#start = datetime.now()
- test_model_qs = super(ConditionSetAdmin, self).get_queryset(request)
- test_model_qs = test_model_qs.prefetch_related('organization', 'journal')
+ test_model_qs = super(ConditionSetAdmin, self).get_queryset(request).select_related('condition_type')
+ # test_model_qs = test_model_qs.prefetch_related('organizationcondition_set', 'journalcondition_set')
+
+ # This could be the cause of 502 errors (out-of-memory killing)
+ #test_model_qs = test_model_qs.prefetch_related(Prefetch('journalcondition_set',
+ # queryset=JournalCondition.objects.select_related('condition_set', 'journal')))
+ #test_model_qs = test_model_qs.prefetch_related('journal')
+ #print(test_model_qs.__dict__)
+
#print('ConditionSetAdmin.get_queryset(): ', datetime.now(), datetime.now()-start)
return test_model_qs
class XConditionValidListFilter(SimpleListFilter):
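""" Filter Journal- or Organization-Condition connections by current validity:
a connection is valid when today's date lies between valid_from and valid_until (inclusive).
"""
# Human-readable title displayed in the right admin sidebar just above the filter options.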
title = _('currently valid')
# Parameter for the filter that will be used in the URL query.
parameter_name = 'valid'
def lookups(self, request, model_admin):
"""
Returns a list of tuples. The first element in each
tuple is the coded value for the option that will
appear in the URL query. The second element is the
human-readable name for the option that will appear
in the right sidebar.
"""
return (
('true', _('True')),
('false', _('False')),
)
def queryset(self, request, queryset):
"""
Returns the filtered queryset based on the value
provided in the query string and retrievable via
`self.value()`.
"""
# Compare the requested value (either 'true' or 'false')
# to decide how to filter the queryset.
if self.value() == 'true':
return queryset.filter(valid_from__lte=date.today(),
valid_until__gte=date.today())
if self.value() == 'false':
return queryset.exclude(valid_from__lte=date.today(),
valid_until__gte=date.today())
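The validity test applied by `XConditionValidListFilter` (`valid_from__lte` and `valid_until__gte` against today's date) amounts to an inclusive date-range check. A plain-Python sketch of that rule follows; the function name `is_currently_valid` and the optional `today` argument are assumptions made for illustration and do not exist in the project. Note that the connect actions earlier in this file use the strict lookups `valid_from__lt` and `valid_until__gt` instead, so the two checks differ on the boundary days.

```python
from datetime import date
from typing import Optional


def is_currently_valid(valid_from: date, valid_until: date,
                       today: Optional[date] = None) -> bool:
    """Return True when `today` lies within [valid_from, valid_until].

    Both bounds are inclusive, matching the `__lte` / `__gte` lookups
    of the list filter. `today` defaults to the current date.
    """
    if today is None:
        today = date.today()
    return valid_from <= today <= valid_until


if __name__ == "__main__":
    start, end = date(2022, 1, 1), date(2022, 12, 31)
    print(is_currently_valid(start, end, today=date(2022, 6, 1)))
    print(is_currently_valid(start, end, today=date(2023, 1, 1)))
```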
class ConditionSetListDynamicFilter(SimpleListFilter):
""" Dynamic filter-by-publisher for ConditionSets. Only Publishers of Journals
connected to the currently displayed ConditionSets are proposed.
"""
title = _('condition sets (publisher-dependent)')
parameter_name = 'condition_set'
def lookups(self, request, model_admin):
if 'journal__publisher__name' in request.GET:
# A publisher name filter is in effect
journal_publisher_name = request.GET['journal__publisher__name']
print([journal_publisher_name])
#cs_by_publisher = model_admin.model.objects.filter(journal__publisher__name=journal_publisher_name)
#print(cs_by_publisher)
#jcs_by_publisher = model_admin.model.objects.all().filter(journal__publisher__name=journal_publisher_name).prefetch_related()
#condition_sets = set([c.condition_set for c in model_admin.model.objects.all().filter(journal__publisher__name=journal_publisher_name)])
#condition_sets = sorted(list(condition_sets))
cs = model_admin.model.objects.filter(journal__publisher__name=journal_publisher_name).values_list('condition_set')
condition_sets = ConditionSet.objects.filter(id__in=cs).order_by('id')
print(condition_sets)
#condition_sets = ConditionSet.objects.filter(journal__publisher__name=journal_publisher_name).order_by('id')
#condition_sets = model_admin.model.objects.filter(journal__publisher__name=journal_publisher_name).
else:
#condition_sets = set([c.condition_set for c in model_admin.model.objects.all()])
condition_sets = ConditionSet.objects.all().order_by('id')
return [(s.id, str(s)) for s in condition_sets]
def queryset(self, request, queryset):
if self.value():
return queryset.filter(condition_set__id__exact=self.value())
@admin.register(OrganizationCondition)
class OrganizationConditionAdmin(ImportExportModelAdmin):
""" Organization-ConditionSet connection admin page
"""
@admin.display(description='Condition Set')
def link_to_conditionset(self, obj):
""" Calculated field for the list display: link to the relevant ConditionSet
"""
link = reverse("admin:django_api_conditionset_change", args=[obj.condition_set.id])
return format_html('<a href="{}">{}</a>', link, obj.condition_set)
search_fields = ("id", "organization__name", "condition_set__id")
list_display = ("id", "organization_name", "link_to_conditionset", "valid_from", "valid_until")
list_select_related = ('condition_set', )
list_filter = ('condition_set__condition_type', XConditionValidListFilter,
# This has become very slow on 2021-12-09, will revisit later
#('condition_set', RelatedOnlyFieldListFilter),
'condition_set__id')
def organization_name(self, obj):
return obj.organization.name
def get_queryset(self, request):
qs = super(OrganizationConditionAdmin, self).get_queryset(request).prefetch_related()
return qs.select_related('condition_set', 'organization', 'condition_set__condition_type', )
@admin.register(Licence)
class LicenceAdmin(ImportExportModelAdmin):
pass
@admin.register(JournalCondition)
class JournalConditionAdmin(ImportExportModelAdmin):
@admin.display(description='Condition Set')
def link_to_conditionset(self, obj):
link = reverse("admin:django_api_conditionset_change", args=[obj.condition_set.id])
return format_html('<a href="{}">{}</a>', link, obj.condition_set)
search_fields = ("id", "journal__name", "condition_set__id")
list_display = ("id", "journal_name", "link_to_conditionset", "valid_from", "valid_until")
list_filter = ('condition_set__condition_type', XConditionValidListFilter,
'journal__publisher__name',
ConditionSetListDynamicFilter)
#list_filter = ('condition_set__condition_type', XConditionValidListFilter,
# 'journal__publisher__name',)
actions = (end_validity, start_validity, )
def journal_name(self, obj):
return obj.journal.name
def get_queryset(self, request):
qs = super(JournalConditionAdmin, self).get_queryset(request)
return qs.select_related('condition_set', 'journal')
# unsuccessful attempt
# def formfield_for_foreignkey(self, db_field, request, **kwargs):
# if db_field.name == "journal":
# kwargs["queryset"] = Journal.objects.filter(publisher__name__in=Publisher.objects.order_by().values('name').distinct())
# return super().formfield_for_foreignkey(db_field, request, **kwargs)
@admin.register(Cost_factor)
class Cost_factorAdmin(ImportExportModelAdmin):
list_display = ("id", "comment", "amount", "symbol")
list_filter = ('cost_factor_type', 'symbol')
# textarea input is better for comments
def get_form(self, request, obj=None, **kwargs):
kwargs['widgets'] = {'comment': forms.Textarea}
return super().get_form(request, obj, **kwargs)
@admin.register(Cost_factor_type)
class Cost_factor_typeAdmin(ImportExportModelAdmin):
list_display = ("id", "name")
diff --git a/django_api/apps.py b/django_api/apps.py
index 9f0069bc..cd2b2e35 100644
--- a/django_api/apps.py
+++ b/django_api/apps.py
@@ -1,6 +1,23 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
from django.apps import AppConfig
class DjangoApiConfig(AppConfig):
name = 'django_api'
- verbose_name = 'OACCT back-end'
+ verbose_name = 'OACT back-end'
diff --git a/django_api/models.py b/django_api/models.py
index 4adfb638..4e653ccb 100644
--- a/django_api/models.py
+++ b/django_api/models.py
@@ -1,479 +1,496 @@
"""
-Django object models for the django_api application of the OACCT project.
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
+"""
+Django object models for the django_api application of the OACT project.
Ref: database_model_20210421_MB.drawio 21.04.2021
"""
from django.db import models
from django.contrib.auth.models import User
import datetime
from django.utils.translation import gettext as _
class Country(models.Model):
""" Countries: used as attributes by Publishers and Organizations
:param name: full English name
:type name: str, optional
:param iso_code: ISO 3166-1 Alpha-3 code https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#Officially_assigned_code_elements
:type iso_code: str, optional
"""
name = models.CharField(verbose_name="Country name", max_length=120, null=True)
iso_code = models.CharField(max_length=3, null=True)
def __str__(self):
return f"{self.name}"
class Meta:
verbose_name_plural = 'Countries'
ordering = ('name',)
class Language(models.Model):
""" Languages: used as attributes by Journals
:param name: full English name
:type name: str, optional
:param iso_code: ISO 639-2 code https://en.wikipedia.org/wiki/ISO_639-2
:type iso_code: str, optional
"""
name = models.CharField(verbose_name="Language name", max_length=120, null=True)
iso_code = models.CharField(max_length=3, null=True)
def __str__(self):
return f"{self.name}"
class Meta:
ordering = ('name',)
class Oa(models.Model):
""" Open Access status: used as attribute by Journals
:param status: short name, ideally one word, e.g. Green, Gold, UNKNOWN...
:type status: str, optional
:param description: description text up to 1000 characters
:type description: str, optional
:param subscription: does a journal with this status require a subscription?
:type subscription: bool
:param accepted_manuscript: does a journal with this status generally allow distribution of the accepted manuscript?
:type accepted_manuscript: bool
:param apc: does a journal with this status require Article Processing Charges (APCs)?
:type apc: bool
:param final_version: does a journal with this status generally allow distribution of the published version?
:type final_version: bool
"""
status = models.CharField(max_length=1000, null=True)
description = models.CharField(max_length=1000, null=True)
subscription = models.BooleanField(default=False)
accepted_manuscript = models.BooleanField(default=False)
apc = models.BooleanField(default=False)
final_version = models.BooleanField(default=False)
def __str__(self):
return f"{self.status}"
class Meta:
ordering = ('-subscription',)
verbose_name = "Open Access status"
verbose_name_plural = "Open Access statuses"
class Publisher(models.Model):
""" Publishers: corporations or societies in charge of Journals
:param name: name
:type name: str, optional
:param city: location of the main office
:type city: str, optional
:param state: if applicable, state or province
:type state: str, optional
:param country: home country or countries
:type country: many-to-many relationship with the `Country` class
:param starting_year: founding year
:type starting_year: int, optional
:param website: main web site
:type website: URL
:param oa_policies: web link to general Open Access policy if applicable
:type oa_policies: URL
"""
name = models.CharField(verbose_name="Publisher name", max_length=1000, null=True)
city = models.CharField(max_length=100, null=True)
state = models.CharField(max_length=3, null=True)
country = models.ManyToManyField("Country")
starting_year = models.IntegerField(blank=True, null=True)
website = models.URLField(max_length=1000)
oa_policies = models.URLField(max_length=1000)
def __str__(self):
return f"{self.name}"
class Meta:
ordering = ('name',)
class Issn(models.Model):
""" ISSNs: a Journal may have several ISSNs (print, electronic or other)
:param journal: Journal object to which the ISSN belongs
:type journal: class `Journal`, optional
:param issn: ISSN code such as 1234-5678
:type issn: str
:param issn_type: Print, Electronic or Other
:type issn_type: str
"""
PRINT = '1'
ELECTRONIC = '2'
OTHER = '3'
TYPE_CHOICES = (
(PRINT, 'Print'),
(ELECTRONIC, 'Electronic'),
(OTHER, 'Other'),
)
journal = models.ForeignKey("Journal", null=True, on_delete=models.CASCADE, related_name = "classIssn") #journal.classissn
issn = models.CharField(max_length=9, null=False)
"""ISSN code such as 1234-5678
"""
issn_type = models.CharField(
choices=TYPE_CHOICES,
max_length=10,
blank=True
)
def __str__(self):
return f"{self.issn} ({dict(self.TYPE_CHOICES)[self.issn_type]})"
class Meta:
ordering = ('issn',)
class Journal(models.Model):
""" Journals: one of the big entities in the application
:param name: journal title
:type name: str
:param name_short_iso_4: ISO 4 abbreviation of the title
:type name_short_iso_4: str
:param publisher: zero or more publishers in charge of the Journal
:type publisher: many-to-many relationship with class `Publisher`
:param website: home page of the journal
:type website: URL
:param language: the journal publishes articles in these zero or more languages
:type language: many-to-many relationship with class `Language`
:param oa_options: web page with the journal's Open Access conditions
:type oa_options: URL, optional
:param oa_status: Open Access status
:type oa_status: reference to an `Oa` object
:param starting_year: founding year
:type starting_year: int, optional
:param end_year: end year if applicable
:type end_year: int, optional
:param doaj_seal: did the journal obtain the DOAJ Seal? https://doaj.org/apply/seal/
:type doaj_seal: bool
:param doaj_status: is the journal accepted in the Directory of Open Access Journals? https://doaj.org
:type doaj_status: bool
:param lockss: is the journal archived by LOCKSS? https://www.lockss.org/about
:type lockss: bool
:param nlch: please remind me what this is supposed to be
:type nlch: bool
:param portico: is the journal archived by Portico? https://www.portico.org/
:type portico: bool
:param qoam_av_score: Quality Open Access Marker (QOAM) score https://www.qoam.eu/
:type qoam_av_score: decimal number
"""
name = models.CharField(verbose_name="Journal name", max_length=800, blank=True, null=True) # search journal with name
name_short_iso_4 = models.CharField(max_length=300, blank=True, null=True)
publisher = models.ManyToManyField(Publisher)
website = models.URLField(max_length=300, blank=True, null=True)
language = models.ManyToManyField(Language)
# 2021-08-11: only one-to-many relationship between Journal and ISSN
# issn = models.ForeignKey("Issn", null=True, on_delete=models.CASCADE)
oa_options = models.URLField(max_length=1000, blank=True, null=True)
oa_status = models.ForeignKey("Oa", related_name ="oa_status", on_delete=models.CASCADE, null=True)
starting_year = models.IntegerField(blank=True, null=True)
end_year = models.IntegerField(blank=True, null=True)
doaj_seal = models.BooleanField(default=False)
doaj_status = models.BooleanField(default=False)
lockss = models.BooleanField(default=False)
nlch = models.BooleanField(default=False)
portico = models.BooleanField(default=False)
qoam_av_score = models.DecimalField(decimal_places=2, max_digits=5, blank=True, null=True)
def __str__(self):
return f"{self.name} from {self.website}"
class Meta:
ordering = ('name',)
class Organization(models.Model):
""" Organizations: one of the big entities in the application, organizations (research institutions or funders) who employ or fund the authors/researchers
:param name: name of the organization
:type name: str
:param website: web site of the organization
:type website: URL, optional
:param country: zero or more home countries
:type country: many-to-many relationship with class `Country`
:param ror: Research Organization Registry (ROR) identifier https://ror.org/
:type ror: str, optional
:param fundref: Crossref Funder Registry identifier https://www.crossref.org/services/funder-registry/
:type fundref: str, optional
:param starting_year: founding year
:type starting_year: int, optional
:param is_funder: if True, the organization is a funding agency, if False a research organization
:type is_funder: bool
:param ir_name: name of the organization's institutional repository for publications, if applicable
:type ir_name: str, optional
:param ir_url: address of the organization's institutional repository for publications, if applicable
:type ir_url: URL, optional
"""
name = models.CharField(verbose_name="Organization name", max_length=600, null=True)
website = models.URLField(max_length=600, blank=True, null=True)
country = models.ManyToManyField("Country")
ror = models.CharField(max_length=255, blank=True, null=True)
fundref = models.CharField(max_length=255, blank=True, null=True)
starting_year = models.IntegerField(blank=True, null=True)
is_funder = models.BooleanField(default=False)
ir_name = models.CharField(verbose_name="Institutional repository name", max_length=40, null=True, blank=True)
ir_url = models.URLField(verbose_name="Institutional repository URL", max_length=100, null=True, blank=True)
def __str__(self):
return f"{self.name}"
class Meta:
ordering = ('name',)
class Version(models.Model):
""" Possible versions of an article during its life cycle: submitted version, accepted version, published version
:param description: name of the version
:type description: str
"""
description = models.CharField(max_length=300, null=False)
def __str__(self):
return f"{self.description}"
class Licence(models.Model):
""" Licenses that can or must be applied to an article version
:param name_or_abbrev: name or abbreviation for the license: copyright, CC-BY,...
:type name_or_abbrev: str
:param website: web page that describes the license terms
:type website: URL, optional
"""
name_or_abbrev = models.CharField(max_length=300, null=False)
website = models.URLField(max_length=600, null=True, blank=True)
class Meta:
ordering = ('name_or_abbrev',)
def __str__(self):
return f"{self.name_or_abbrev}"
class Cost_factor_type(models.Model):
""" Cost factor types: amount, discount...
:param name: name of the type
:type name: str
"""
name = models.CharField(max_length=300, null=False)
def __str__(self):
return f"{self.name}"
class Cost_factor(models.Model):
""" Cost factors: financial terms applicable to use an Open Access option
:param cost_factor_type: type of the cost factor
:type cost_factor_type: reference to a `Cost_factor_type` object
:param amount: actual cost or discount
:type amount: int
:param symbol: currency code or %
:type symbol: str
:param comment: extra information in free text
:type comment: str, optional
"""
cost_factor_type = models.ForeignKey(Cost_factor_type, on_delete=models.CASCADE, blank=True, null=True)
amount = models.IntegerField(null=False)
symbol = models.CharField(max_length=10, null=False)
comment = models.CharField(max_length=120, default="")
class Meta:
ordering = ('amount',)
def __str__(self):
return f"{self.id} - {self.amount} {self.symbol} - {self.comment}"
class Term(models.Model):
""" Terms: possible options to disseminate an article in Open Access
:param version: zero or more versions for which the Term is applicable (currently only 1 is supported by the application)
:type version: many-to-many relationship to the `Version` class
:param cost_factor: zero or more possible cost factors
:type cost_factor: many-to-many relationship to the `Cost_factor` class
:param licence: zero or more possible licenses
:type licence: many-to-many relationship to the `Licence` class
:param embargo_months: duration of a possible embargo in months
:type embargo_months: int
:param ir_archiving: is archiving in an institutional repository allowed/required or not?
:type ir_archiving: bool
:param comment: extra information as free text
:type comment: str, optional
"""
version = models.ManyToManyField(Version)
cost_factor = models.ManyToManyField(Cost_factor)
licence = models.ManyToManyField(Licence)
embargo_months = models.IntegerField(blank=True, null=True)
ir_archiving = models.BooleanField(default=False)
comment = models.CharField(max_length=1000, null=True, blank=True)
def __str__(self):
try:
# Maybe these fields should not allow NULL values?
if self.embargo_months is None:
embargo = 'no_'
else:
embargo = str(self.embargo_months)
if self.comment is None:
comment = ''
else:
comment = str(self.comment)
term_data = (str(self.id),
';'.join([str(x) for x in self.version.all()]),
';'.join([str(x) for x in self.licence.all()]),
';'.join([str(x) for x in self.cost_factor.all()]),
f'Archiving{str(self.ir_archiving)} {embargo}months',
comment,)
return ' - '.join(term_data)
except RecursionError:
# The JSON import in the admin module somehow throws a ValueError during the loading process
# probably due to incomplete information in the many2many relationships
# Then the error log apparently triggers a cascade of errors until
# the RecursionError level is hit. Falling back to a basic __str__
# for the RecursionError seems to bypass the problem.
return f"[Term.__str__() error] {self.id} - {self.comment}"
class Meta:
ordering = ('-ir_archiving', 'embargo_months', 'comment')
class ConditionType(models.Model):
""" Condition types: issued by a journal, by an organization, or agreement between both?
:param condition_issuer: `organization-only`, `agreement` or `journal-only`
:type condition_issuer: str
"""
condition_issuer = models.CharField(max_length=300, null=False)
def __str__(self):
return f"{self.condition_issuer}"
class ConditionSubType(models.Model):
""" Condition subtypes: in case we need to distinguish more finely than the 3 main condition types
:param label: name of the subtype
:type label: str
"""
label = models.CharField(max_length=300, null=False)
def __str__(self):
return f"{self.label}"
@classmethod
def get_default_pk(cls):
""" An automatic subtype is attributed to any newly created `ConditionSet` object
"""
condition_subtype, created = cls.objects.get_or_create(label='Automatic')
return condition_subtype.pk
class ConditionSet(models.Model):
""" Condition sets: collections of Open Access terms applicable to zero or more Journals and zero or more Organizations
for some specific reason (policy document, agreement, contract...).
:param condition_type: type for the condition set
:type condition_type: reference to a `ConditionType` object
:param subtype: subtype for the condition set
:type subtype: reference to a `ConditionSubType` object
:param organization: zero or more organisations to which the condition set is applicable
:type organization: many-to-many relationship with the `Organization` class with `OrganizationCondition` objects as connectors
:param journal: zero or more journals to which the condition set is applicable
:type journal: many-to-many relationship with the `Journal` class with `JournalCondition` objects as connectors
:param term: zero or more terms included in the condition set
:type term: many-to-many relationship with the `Term` class
:param source: web page with information about the condition set (origin, perimeter, etc.)
:type source: URL, optional
:param comment: description of the condition set as free text (will be used as a title in the frontend)
:type comment: str, optional
"""
condition_type = models.ForeignKey(ConditionType, on_delete=models.CASCADE, blank=True, null=True)
subtype = models.ForeignKey(ConditionSubType, on_delete=models.CASCADE,
default=ConditionSubType.get_default_pk, null=True)
organization = models.ManyToManyField(
Organization,
through='OrganizationCondition',
through_fields=('condition_set', 'organization')
)
journal = models.ManyToManyField(
Journal,
through='JournalCondition',
through_fields=('condition_set', 'journal')
)
term = models.ManyToManyField(Term)
source = models.URLField(max_length=600, null=True, blank=True)
comment = models.CharField(max_length=100, null=True, blank=True)
def __str__(self):
return f"{self.id} {self.condition_type}|{self.comment}"
class Meta:
# TODO does this work??? 2ndary sort showing institution first, funder second
# ordering = ('condition_type__pk', 'organization__is_funder', 'subtype__id', 'comment')
# No it does not, it duplicates most journal policies (one copy for funders, one for institutions)
ordering = ('condition_type__pk', 'subtype__id', 'comment')
class OrganizationCondition(models.Model):
""" Organization-ConditionSet connector, linking `organization` with
`condition_set`. The first (`valid_from`) and last (`valid_until`) known days of validity are recorded.
"""
organization = models.ForeignKey(Organization, on_delete=models.CASCADE, blank=True, null=True)
condition_set = models.ForeignKey(ConditionSet, on_delete=models.CASCADE, blank=True, null=True)
valid_from = models.DateField(blank=True, null=True)
valid_until = models.DateField(blank=True, null=True)
class Meta:
verbose_name = "Organization/condition_set relationship"
def __str__(self):
return f"{self.id} {self.organization.name}/ConditionSet {self.condition_set.id}"
class JournalCondition(models.Model):
""" Journal-ConditionSet connector, linking `journal` with
`condition_set`. The first (`valid_from`) and last (`valid_until`) known days of validity are recorded.
"""
journal = models.ForeignKey(Journal, on_delete=models.CASCADE, blank=True, null=True)
condition_set = models.ForeignKey(ConditionSet, on_delete=models.CASCADE, blank=True, null=True)
valid_from = models.DateField(blank=True, null=True)
valid_until = models.DateField(blank=True, null=True)
class Meta:
verbose_name = "Journal/condition_set relationship"
def __str__(self):
return f"{self.id} {self.journal.name}/{self.condition_set}"
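Both connector models record a validity window through `valid_from` and `valid_until`, and both fields may be NULL. The sketch below shows one way to test whether a connector applies on a given day. Treating a missing bound as "open-ended" is an assumption made here for illustration; the API filters may treat NULL dates differently. `is_valid_on` is a hypothetical helper, not part of the models.

```python
from datetime import date
from typing import Optional


def is_valid_on(day: date,
                valid_from: Optional[date],
                valid_until: Optional[date]) -> bool:
    """Return True if `day` lies inside the [valid_from, valid_until] window.

    Assumption: a bound of None means that side of the window is unknown
    and therefore not restrictive.
    """
    if valid_from is not None and day < valid_from:
        return False
    if valid_until is not None and day > valid_until:
        return False
    return True
```

Both bounds are inclusive, matching the docstrings' wording of first and last known days of validity.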
diff --git a/django_api/serializers.py b/django_api/serializers.py
index 880944f6..10605fd5 100644
--- a/django_api/serializers.py
+++ b/django_api/serializers.py
@@ -1,311 +1,328 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
""" REST API serializers
All serializers inherit from WritableNestedModelSerializer to allow writing nested objects
through the API as per https://github.com/beda-software/drf-writable-nested
and RQLMixin to support the Resource Query Language (RQL) https://django-rql.readthedocs.io/en/latest/
"""
from rest_framework import serializers
from dj_rql.drf.serializers import RQLMixin
from .models import *
from drf_writable_nested.serializers import WritableNestedModelSerializer
class CountrySerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for Countries
"""
id = serializers.IntegerField(required=False)
name = serializers.CharField(required=False)
iso_code = serializers.CharField(required=False)
class Meta:
model = Country
fields = '__all__'
depth = 4
class LanguageSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for Languages
"""
id = serializers.IntegerField(required=False)
name = serializers.CharField(required=False)
iso_code = serializers.CharField(required=False)
class Meta:
model = Language
fields = '__all__'
depth = 4
class PublisherSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for Publishers
"""
id = serializers.IntegerField(required=False)
country = CountrySerializer(required=False, many=True)
class Meta:
model = Publisher
fields = '__all__'
depth = 4
class OaSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for OA statuses
"""
id = serializers.IntegerField(required=False, allow_null=True)
description = serializers.CharField(required=False, allow_null=True)
subscription = serializers.BooleanField(required=False)
accepted_manuscript = serializers.BooleanField(required=False)
apc = serializers.BooleanField(required=False)
final_version = serializers.BooleanField(required=False)
class Meta:
model = Oa
fields = '__all__'
depth = 4
class IssnSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for ISSNs
"""
id = serializers.IntegerField(required=False)
class Meta:
model = Issn
fields = '__all__'
depth = 1
class JournalSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for Journals
"""
id = serializers.IntegerField(required=False)
issn = IssnSerializer(required=False, source='classIssn', many=True)
publisher = PublisherSerializer(required=False, many=True)
language = LanguageSerializer(required=False, many=True)
# allow update via post request --> "oa_status": {2},
# oa_status = serializers.PrimaryKeyRelatedField(queryset=Oa.objects.all())
oa_status = OaSerializer(required=False,allow_null=True)
class Meta:
model = Journal
fields = '__all__'
depth = 4
class LicenceSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for Licences
"""
id = serializers.IntegerField(required=False)
name_or_abbrev = serializers.CharField()
website = serializers.URLField(allow_null=True, required=False)
class Meta:
model = Licence
fields = '__all__'
depth = 4
class Cost_factor_typeSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for cost factor types
"""
id = serializers.IntegerField(required=False)
name = serializers.CharField()
class Meta:
model = Cost_factor_type
fields = '__all__'
depth = 4
class VersionSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for article versions
"""
id = serializers.IntegerField(required=False)
description = serializers.CharField()
class Meta:
model = Version
fields = '__all__'
depth = 4
class OrgaSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for organizations
"""
id = serializers.IntegerField(required=False)
country = CountrySerializer(required=False, many=True)
class Meta:
model = Organization
fields = '__all__'
depth = 4
class Cost_factorSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for cost factors
"""
id = serializers.IntegerField(required=False)
cost_factor_type = Cost_factor_typeSerializer(required=False, allow_null=True)
amount = serializers.IntegerField()
symbol = serializers.CharField()
comment = serializers.CharField(required=False)
class Meta:
model = Cost_factor
fields = '__all__'
depth = 4
class TermSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for terms
"""
id = serializers.IntegerField(required=False)
version = VersionSerializer(required=False, many=True)
cost_factor = Cost_factorSerializer(required=False, many=True)
licence = LicenceSerializer(required=False, many=True)
class Meta:
model = Term
fields = '__all__'
depth = 4
class ConditionTypeSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for condition types
"""
id = serializers.IntegerField(required=False)
condition_issuer = serializers.CharField()
class Meta:
model = ConditionType
fields = '__all__'
depth = 4
class ConditionSubTypeSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for condition subtypes
"""
id = serializers.IntegerField(required=False)
label = serializers.CharField()
class Meta:
model = ConditionSubType
fields = '__all__'
depth = 4
class ConditionSetSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for condition sets
"""
id = serializers.IntegerField(required=False)
term = TermSerializer(many=True, read_only=False)
condition_type = ConditionTypeSerializer(read_only=False)
subtype = ConditionSubTypeSerializer(read_only=False)
organization = OrgaSerializer(many=True, read_only=False)
journal = JournalSerializer(many=True, read_only=False)
comment = serializers.CharField(read_only=False)
source = serializers.URLField(read_only=False)
class Meta:
model = ConditionSet
# pre filter for rql
# fields = ['id','condition_type','term','journal','organization']
# added for information purposes
fields = '__all__'
depth = 4
class JournalIdSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API light-weight serializer for journals, using only the ID.
Used by the frontend when building the query
"""
id = serializers.IntegerField(required=False)
# allow update via post request --> "oa_status": {2},
class Meta:
model = Journal
fields = ['id']
class ConditionSetLightSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API serializer for condition sets, providing only the information
needed in the frontend to improve performance
"""
id = serializers.IntegerField(required=False)
term = TermSerializer(many=True, read_only=False)
condition_type = ConditionTypeSerializer(read_only=False)
subtype = ConditionSubTypeSerializer(read_only=False)
organization = OrgaSerializer(many=True, read_only=False)
# Journals are reduced to their IDs in this one.
journal = JournalIdSerializer(many=True, read_only=False)
comment = serializers.CharField(read_only=False)
source = serializers.URLField(read_only=False)
class Meta:
model = ConditionSet
# pre filter for rql
# fields = ['id','condition_type','term','journal','organization']
# added for information purposes
fields = ['id', 'condition_type', 'subtype', 'term', 'organization', 'journal', 'comment', 'source']
depth = 4
class JournalLightSerializer(WritableNestedModelSerializer, RQLMixin):
""" REST API lighter serializer for journals
"""
id = serializers.IntegerField(required=False)
# allow update via post request --> "oa_status": {2},
oa_status = serializers.PrimaryKeyRelatedField(queryset=Oa.objects.all())
language = serializers.PrimaryKeyRelatedField(queryset=Language.objects.all(), many=True)
publisher = serializers.PrimaryKeyRelatedField(queryset=Publisher.objects.all(), many=True)
starting_year = serializers.IntegerField(required=False)
end_year = serializers.IntegerField(required=False)
class Meta:
model = Journal
fields = ['id', 'name', 'oa_status', 'language', 'publisher', 'starting_year', 'end_year']
depth = 1
class OaSerializer(WritableNestedModelSerializer,RQLMixin):
""" REST API serializers for OA statuses
"""
id = serializers.IntegerField(required=False)
status = serializers.CharField(allow_null=True)
description = serializers.CharField(allow_null=True)
subscription = serializers.BooleanField(required=False)
accepted_manuscript = serializers.BooleanField(required=False)
apc = serializers.BooleanField(required=False)
final_version = serializers.BooleanField(required=False)
class Meta:
model = Oa
fields = '__all__'
depth = 4
class OrganizationConditionSerializer(serializers.ModelSerializer, RQLMixin):
""" REST API serializers for Organisation-condition connections
"""
id = serializers.IntegerField(required=False)
organization = OrgaSerializer(required=False)
condition_set = ConditionSetSerializer(required=False)
class Meta:
model = OrganizationCondition
fields = '__all__'
depth = 4
class JournalConditionSerializer(serializers.ModelSerializer, RQLMixin):
""" REST API serializers for Journal-condition connections
"""
id = serializers.IntegerField(required=False)
journal = JournalSerializer(required=False)
condition_set = ConditionSetSerializer(required=False)
class Meta:
model = JournalCondition
fields = '__all__'
depth = 4
diff --git a/django_api/tests.py b/django_api/tests.py
index 7ce503c2..f8f7dc4c 100644
--- a/django_api/tests.py
+++ b/django_api/tests.py
@@ -1,3 +1,20 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
from django.test import TestCase
# Create your tests here.
diff --git a/django_api/urls.py b/django_api/urls.py
index 2c99628e..996883b6 100644
--- a/django_api/urls.py
+++ b/django_api/urls.py
@@ -1,39 +1,56 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
from django.urls import path, re_path, include
from django.conf.urls.static import static
from django.conf import settings
from .views import *
from rest_framework import routers
from rest_framework.schemas import get_schema_view
router = routers.DefaultRouter()
router.register(r'journal', JournalViewSet)
router.register(r'journal_light', JournalLightViewSet)
router.register(r'organization', OrgaViewSet)
router.register(r'funder', FunderViewSet)
router.register(r'conditionset', ConditionSetViewSet)
router.register(r'conditionset_light', ConditionSetLightViewSet)
router.register(r'term', TermViewSet)
# show table details in the API
router.register(r'country', CountryViewSet)
router.register(r'language', LanguageViewSet)
router.register(r'issn', IssnViewSet)
router.register(r'oa', OaViewSet)
router.register(r'publisher', PublisherViewSet)
router.register(r'version', VersionViewSet)
router.register(r'licence', LicenceViewSet)
router.register(r'cost_factor_type', Cost_factor_typeViewSet)
router.register(r'cost_factor', Cost_factorViewSet)
router.register(r'conditiontype', ConditionTypeViewSet)
router.register(r'JournalCondition', JournalConditionViewSet)
router.register(r'organizationCondition', OrganizationConditionViewSet)
urlpatterns = [
path('', include(router.urls)),
path('openapi', get_schema_view(
- title="OACCT API",
- description="API of the Open Access Compliance Check Tool (OACCT)",
- version ="0.9"
+ title="OACT API",
+ description="API of the Open Access Check Tool (OACT)",
+ version ="1.0"
), name='openapi-schema'),
]
diff --git a/django_api/views.py b/django_api/views.py
index d43d690f..fe1c38b3 100644
--- a/django_api/views.py
+++ b/django_api/views.py
@@ -1,304 +1,321 @@
+"""
+This is the Open Access Check Tool (OACT).
+The publication of scientific articles as Open Access (OA), usually in the variants "Green OA" and "Gold OA", allows free access to scientific research results and their largely unhindered dissemination. Often, however, the multitude of available publication conditions makes the decision in favor of a particular journal difficult: requirements of the funding agencies and publication guidelines of the universities and colleges must be carefully compared with the offers of the publishing houses, and separately concluded publication agreements can also offer additional benefits. The "OA Compliance Check Tool" provides a comprehensive overview of the possible publication conditions for a large number of journals, especially for the Swiss university landscape, and thus supports the decision-making process.
+
+© All rights reserved. ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, Scientific Information and Libraries, 2022
+
+See LICENSE.TXT for more details.
+
+This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+"""
+
from django.contrib.auth.models import AbstractUser
from django.shortcuts import render
from django.contrib.auth import authenticate, login, logout
from django.shortcuts import render
from django.http import HttpResponse, HttpResponseRedirect, Http404, JsonResponse
from .models import *
from .serializers import *
from rest_framework import viewsets, filters, generics
from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_page
from rest_framework.authentication import BasicAuthentication
from rest_framework.permissions import IsAuthenticatedOrReadOnly
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response
from rest_framework_tracking.mixins import LoggingMixin
from itertools import chain
from django.db.models import Count, Max
from dj_rql.filter_cls import RQLFilterClass
from urllib.parse import unquote
from datetime import date
import ipaddress
class JournalViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
search_fields = ['name']
filter_backends = (filters.SearchFilter,)
queryset = Journal.objects.all()
serializer_class = JournalSerializer
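`JournalViewSet` combines `filters.SearchFilter` with `search_fields = ['name']`, so journals can be searched by title through the `search` query parameter, which is the default parameter name in Django REST framework. The sketch below builds such a request URL. The base URL is an assumption taken from the request examples in the `ConditionSetFilters` docstring, and `journal_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def journal_search_url(term: str,
                       base_url: str = "http://127.0.0.1:8000/api/") -> str:
    """Build the URL of a title search on the `journal` endpoint.

    Assumption: the router is mounted under `base_url`.
    """
    query = urlencode({"search": term})
    return f"{base_url.rstrip('/')}/journal/?{query}"


print(journal_search_url("Nature Physics"))
```

Encoding the term with `urlencode` keeps titles with spaces or punctuation safe to send.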
class JournalLightViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
search_fields = ['name']
filter_backends = (filters.SearchFilter,)
queryset = Journal.objects.all().prefetch_related('publisher', 'language', 'oa_status')
serializer_class = JournalLightSerializer
@method_decorator(cache_page(4 * 60 * 60))
def dispatch(self, request, *args, **kwargs):
# print('dispatch() called')
return super().dispatch(request, *args, **kwargs)
@method_decorator(cache_page(4 * 60 * 60))
def list(self, request):
# print('list() called')
serializer = self.serializer_class(self.queryset, many=True)
return Response(serializer.data)
class OrgaViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = OrgaSerializer
queryset = Organization.objects.filter(
is_funder=False
)
class ConditionSetFilters(RQLFilterClass):
""" API filters for the essential query on ConditionSets
Arguments can include a journal id, zero to two organization ids,
validity dates and a condition type.
Request examples:
http://127.0.0.1:8000/api/conditionset/?and(eq(journalcondition.journal.id,3),eq(organizationcondition.organization.id,11),eq(condition_type.id,1))
http://127.0.0.1:8000/api/conditionset/?and(eq(journalcondition.journal.id,14),ne(condition_type.id,2),ge(journalcondition.valid_until,2021-08-20),le(journalcondition.valid_from,2021-08-20),ge(organizationcondition.valid_until,2021-08-20),le(organizationcondition.valid_from,2021-08-20))
"""
MODEL = ConditionSet
DISTINCT = True
FILTERS = (
'id',
{
'namespace': 'journalcondition',
'filters': ['id', 'valid_from', 'valid_until',
{
'namespace': 'journal',
'filters': ['id', ],
}
],
},
{
'namespace': 'organizationcondition',
'filters': ['id', 'valid_from', 'valid_until',
{
'namespace': 'organization',
'filters': ['id', ]
}
],
},
{
'namespace': 'condition_type',
'filters': ['id', ],
},
)
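The request examples in the `ConditionSetFilters` docstring can be assembled programmatically. The following is a minimal sketch, not part of this change: the helper name `build_conditionset_url` and its parameters are hypothetical, and the base URL is simply the local development address used in the docstring. It only emits filter names declared in `FILTERS` above.

```python
from datetime import date

# Assumption: local development base URL, copied from the docstring examples.
BASE_URL = "http://127.0.0.1:8000/api/conditionset/"


def build_conditionset_url(journal_id, organization_id=None,
                           condition_type_id=None, valid_on=None):
    """Return an RQL query URL for the condition sets of one journal (hypothetical helper)."""
    clauses = [f"eq(journalcondition.journal.id,{journal_id})"]
    if organization_id is not None:
        clauses.append(f"eq(organizationcondition.organization.id,{organization_id})")
    if condition_type_id is not None:
        clauses.append(f"eq(condition_type.id,{condition_type_id})")
    if valid_on is not None:
        # Keep only conditions whose validity interval contains the given day.
        day = valid_on.isoformat()
        for namespace in ("journalcondition", "organizationcondition"):
            clauses.append(f"ge({namespace}.valid_until,{day})")
            clauses.append(f"le({namespace}.valid_from,{day})")
    # A single clause is sent bare; several clauses are wrapped in and(...).
    query = clauses[0] if len(clauses) == 1 else "and(" + ",".join(clauses) + ")"
    return f"{BASE_URL}?{query}"


print(build_conditionset_url(3, organization_id=11, condition_type_id=1))
print(build_conditionset_url(14, valid_on=date(2021, 8, 20)))
```

The first call reproduces the first docstring example; the second adds the validity-date clauses for both namespaces.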
class MyLoggingMixin(LoggingMixin):
"""
Supercharge drf_tracking.LoggingMixin to get the real IP address in the OpenShift infrastructure
"""
def _get_ip_address(self, request):
"""Get the remote ip address the request was generated from."""
- print(request.META)
- ipaddr = request.META.get("HTTP_X_FORWARDED_FOR", None)
+ # print(request.META)
+ ipaddr = request.META.get("HTTP_X_REAL_IP", None)
if ipaddr:
ipaddr = ipaddr.split(",")[0]
else:
- ipaddr = request.META.get("HTTP_X_REAL_IP", None)
+ ipaddr = request.META.get("HTTP_X_FORWARDED_FOR", None)
if ipaddr:
ipaddr = ipaddr.split(",")[0]
else:
ipaddr = request.META.get("REMOTE_ADDR", "").split(",")[0]
# Account for IPv4 and IPv6 addresses, each possibly with port appended. Possibilities are:
# <ipv4 address>
# <ipv6 address>
# <ipv4 address>:port
# [<ipv6 address>]:port
# Note that ipv6 addresses are colon separated hex numbers
possibles = (ipaddr.lstrip("[").split("]")[0], ipaddr.split(":")[0])
for addr in possibles:
try:
return str(ipaddress.ip_address(addr))
except ValueError:
pass
return ipaddr
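The header fallback and port stripping performed by `_get_ip_address` can be exercised outside Django. The sketch below is a standalone approximation under stated assumptions: `client_ip` is a hypothetical function that takes a plain dict shaped like `request.META`, and it additionally strips surrounding whitespace, which the method above does not do.

```python
import ipaddress


def client_ip(meta):
    """Return the client IP from a request.META-like mapping (hypothetical helper)."""
    raw = ""
    # Django exposes HTTP headers in META as HTTP_<NAME>, upper-cased with underscores.
    for key in ("HTTP_X_REAL_IP", "HTTP_X_FORWARDED_FOR", "REMOTE_ADDR"):
        value = meta.get(key)
        if value:
            # A proxy chain is comma separated; the first entry is the client.
            raw = value.split(",")[0].strip()
            break
    # First candidate handles "[ipv6]:port" and bare addresses,
    # second candidate handles "ipv4:port".
    candidates = (raw.lstrip("[").split("]")[0], raw.split(":")[0])
    for candidate in candidates:
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            continue
    return raw


print(client_ip({"HTTP_X_FORWARDED_FOR": "198.51.100.4:8080, 10.0.0.1"}))
print(client_ip({"REMOTE_ADDR": "[2001:db8::1]:443"}))
```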
class ConditionSetViewSet(MyLoggingMixin, viewsets.ModelViewSet):
""" ViewSet for ConditionSets
The QuerySet obtained from the database is annotated to obtain the desired sorting order,
i.e. by condition_type, then subtype, then a calculated score so that institutions receive
more attention than funders within a given type/subtype
"""
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
queryset = ConditionSet.objects.all().annotate(include_funder=Max('organization__is_funder')).order_by('condition_type','subtype', 'include_funder','comment')
# queryset = ConditionSet.objects.values('term__version__description')
serializer_class = ConditionSetSerializer
# serializer_class = ConditionGroupedSerializer
rql_filter_class = ConditionSetFilters
#.objects.values('term__version.description')
class ConditionSetLightViewSet(MyLoggingMixin, viewsets.ModelViewSet):
""" Light-weight ViewSet for ConditionSets
The QuerySet obtained from the database is annotated to obtain the desired sorting order,
i.e. by condition_type, then subtype, then a calculated score so that institutions receive
more attention than funders within a given type/subtype
"""
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
queryset = ConditionSet.objects.all().annotate(include_funder=Max('organization__is_funder')).order_by('condition_type','subtype','include_funder','comment')
serializer_class = ConditionSetLightSerializer
rql_filter_class = ConditionSetFilters
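The sorting order described in the two ConditionSet docstrings can be illustrated in plain Python. This is only a sketch on hypothetical dictionaries, not the ORM query itself: `organization_is_funder` stands in for the `is_funder` flags of the linked organizations, and a row without organizations is treated as `False`, whereas the database would yield NULL for `Max(...)` with backend-dependent ordering.

```python
def sort_condition_sets(rows):
    """Order rows like order_by('condition_type', 'subtype', 'include_funder', 'comment')."""
    def key(row):
        # include_funder mirrors Max('organization__is_funder'): True as soon as
        # one linked organization is a funder, False otherwise.
        include_funder = any(row["organization_is_funder"])
        return (row["condition_type"], row["subtype"], include_funder, row["comment"])
    return sorted(rows, key=key)


rows = [
    {"id": 1, "condition_type": 1, "subtype": "a", "organization_is_funder": [True], "comment": ""},
    {"id": 2, "condition_type": 1, "subtype": "a", "organization_is_funder": [False], "comment": ""},
    {"id": 3, "condition_type": 0, "subtype": "b", "organization_is_funder": [], "comment": ""},
]
print([row["id"] for row in sort_condition_sets(rows)])
```

Within the same type and subtype, the institution-only row (id 2) sorts before the row that involves a funder (id 1).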
class FunderViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = OrgaSerializer
queryset = Organization.objects.filter(
is_funder=True
)
class TermViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = TermSerializer
queryset = Term.objects.all()
class CountryViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = CountrySerializer
queryset = Country.objects.all()
class LanguageViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = LanguageSerializer
queryset = Language.objects.all()
class IssnViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = IssnSerializer
queryset = Issn.objects.all()
class OaViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = OaSerializer
queryset = Oa.objects.all()
class PublisherViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = PublisherSerializer
queryset = Publisher.objects.all()
class VersionViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = VersionSerializer
queryset = Version.objects.all()
class LicenceViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = LicenceSerializer
queryset = Licence.objects.all()
class Cost_factor_typeViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = Cost_factor_typeSerializer
queryset = Cost_factor_type.objects.all()
class Cost_factorViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = Cost_factorSerializer
queryset = Cost_factor.objects.all()
class ConditionTypeViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = ConditionTypeSerializer
queryset = ConditionType.objects.all()
class OrganizationConditionViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = OrganizationConditionSerializer
queryset = OrganizationCondition.objects.all()
class JournalConditionViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = JournalConditionSerializer
queryset = JournalCondition.objects.all()
class OrganizationConditionViewSet(viewsets.ModelViewSet):
authentication_classes = (BasicAuthentication,)
permission_classes = [IsAuthenticatedOrReadOnly]
serializer_class = OrganizationConditionSerializer
queryset = OrganizationCondition.objects.all()
# Count number of different version
# OrganizationCondition.objects.annotate(version_count=Count('condition_set__term__version'))
# OrganizationCondition.objects
# .values('condition_set__term__version') #what to group by
# .annotate(version_count=Count('condition_set__term__version')) # what to aggregate
# group by version and count
# OrganizationCondition.objects.values('condition_set__term__version').annotate(version_count=Count('condition_set__term__version'))
# source https://hakibenita.com/django-group-by-sql
# https://docs.djangoproject.com/en/3.2/topics/db/aggregation/
# OrganizationCondition.objects.values('condition_set__term__version').filter(organization_id=1).annotate(version_count=Count('condition_set__term__version'))
\ No newline at end of file
diff --git a/django_app/settings.py b/django_app/settings.py
index 86c246da..a24ffdc0 100644
--- a/django_app/settings.py
+++ b/django_app/settings.py
@@ -1,176 +1,182 @@
"""
Django settings for django_api project.
Generated by 'django-admin startproject' using Django 3.1.3.
For more information on this file, see
https://docs.djangoproject.com/en/3.1/topics/settings/
For the full list of settings and their values, see
https://docs.djangoproject.com/en/3.1/ref/settings/
"""
import os
# Build paths inside the project like this: BASE_DIR / 'subdir'.
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/3.1/howto/deployment/checklist/
# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = 'SECRET_KEY'
# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = True
+# In case something goes wrong while trying to run with DEBUG = False
+DEBUG_PROPAGATE_EXCEPTIONS = True
+
ALLOWED_HOSTS = ['0.0.0.0',
'127.0.0.1',
]
INTERNAL_IPS = [
# ...
'127.0.0.1',
# ...
]
# Application definition
INSTALLED_APPS = [
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'import_export',
'django_api',
'corsheaders',
'rest_framework',
'rql_filter',
'django_extensions',
- 'debug_toolbar',
+ #'debug_toolbar',
'rest_framework_tracking',
# Not necessary at this point but let's keep it as a possible idea
# 'inline_actions',
]
MIDDLEWARE = [
#'debug_toolbar.middleware.DebugToolbarMiddleware',
'django.middleware.security.SecurityMiddleware',
'whitenoise.middleware.WhiteNoiseMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'corsheaders.middleware.CorsMiddleware',
'django.middleware.common.CommonMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
ROOT_URLCONF = 'django_app.urls'
CORS_ORIGIN_ALLOW_ALL = False
TEMPLATES = [
{
'BACKEND': 'django.template.backends.django.DjangoTemplates',
'DIRS': [
os.path.join(BASE_DIR, 'templates'),
os.path.join(BASE_DIR, 'staticfiles')
],
'APP_DIRS': True,
'OPTIONS': {
'context_processors': [
'django.template.context_processors.debug',
'django.template.context_processors.request',
'django.contrib.auth.context_processors.auth',
'django.contrib.messages.context_processors.messages',
],
},
},
]
TEMPLATE_LOADERS = (
('django.template.loaders.cached.Loader', (
'django.template.loaders.filesystem.Loader',
'django.template.loaders.app_directories.Loader',
)),
)
WSGI_APPLICATION = 'django_app.wsgi.application'
# Database
# https://docs.djangoproject.com/en/3.1/ref/settings/#databases
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3',
'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
}
}
# Password validation
# https://docs.djangoproject.com/en/3.1/ref/settings/#auth-password-validators
AUTH_PASSWORD_VALIDATORS = [
{
'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
},
]
# Internationalization
# https://docs.djangoproject.com/en/3.1/topics/i18n/
LANGUAGE_CODE = 'en-us'
TIME_ZONE = 'Europe/Zurich'
USE_I18N = True
USE_L10N = True
USE_TZ = True
# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/3.1/howto/static-files/
STATIC_URL = '/static/'
STATICFILES_DIRS = [
os.path.join(BASE_DIR, 'static'),
]
# static file for production use only
STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'
REST_FRAMEWORK = {
'DEFAULT_FILTER_BACKENDS': ['dj_rql.drf.RQLFilterBackend']
}
# will this be enough to set 2 validity dates for all journals?
DATA_UPLOAD_MAX_NUMBER_FIELDS = 40000
# Provide Django 3.2 with a safe default
DEFAULT_AUTO_FIELD = 'django.db.models.AutoField'
# drf-tracking should not expose editable log entries
-DRF_TRACKING_ADMIN_LOG_READONLY=True
+DRF_TRACKING_ADMIN_LOG_READONLY = True
+
+SHELL_PLUS_PRINT_SQL_TRUNCATE = None
+RUNSERVER_PLUS_PRINT_SQL_TRUNCATE = None
diff --git a/import_scripts/01_oacct_countries.md b/import_scripts/01_oacct_countries.md
deleted file mode 100644
index 1aa14a4e..00000000
--- a/import_scripts/01_oacct_countries.md
+++ /dev/null
@@ -1,587 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) Project
-
-P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts selected data from the sources obtained via API and processes it for use in the OACCT application.
-
-Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-```
-
-## Table Countries
-
-
-```python
-# The table was corrected to add the missing value at the end:
-# International Agency International Agency OI INT 999
-country = pd.read_csv('iso_3166.txt', encoding='utf-8', header=0, sep='\t', na_filter=False)
-country
-```
-
-|     | English short name | French short name | Alpha-2 code | Alpha-3 code | Numeric |
-| --- | --- | --- | --- | --- | --- |
-| 0 | Afghanistan | Afghanistan (l') | AF | AFG | 4 |
-| 1 | Albania | Albanie (l') | AL | ALB | 8 |
-| 2 | Algeria | Algérie (l') | DZ | DZA | 12 |
-| 3 | American Samoa | Samoa américaines (les) | AS | ASM | 16 |
-| 4 | Andorra | Andorre (l') | AD | AND | 20 |
-| ... | ... | ... | ... | ... | ... |
-| 245 | Yemen | Yémen (le) | YE | YEM | 887 |
-| 246 | Zambia | Zambie (la) | ZM | ZMB | 894 |
-| 247 | Zimbabwe | Zimbabwe (le) | ZW | ZWE | 716 |
-| 248 | Åland Islands | Åland(les Îles) | AX | ALA | 248 |
-| 249 | International Agency | International Agency | OI | INT | 999 |
-
-250 rows × 5 columns
-
-```python
-country.loc[country['Alpha-2 code'].isnull()]
-```
-
-|     | English short name | French short name | Alpha-2 code | Alpha-3 code | Numeric |
-| --- | --- | --- | --- | --- | --- |
-
-```python
-# convert the index to an id
-country = country.reset_index()
-country
-```
-
-|     | index | English short name | French short name | Alpha-2 code | Alpha-3 code | Numeric |
-| --- | --- | --- | --- | --- | --- | --- |
-| 0 | 0 | Afghanistan | Afghanistan (l') | AF | AFG | 4 |
-| 1 | 1 | Albania | Albanie (l') | AL | ALB | 8 |
-| 2 | 2 | Algeria | Algérie (l') | DZ | DZA | 12 |
-| 3 | 3 | American Samoa | Samoa américaines (les) | AS | ASM | 16 |
-| 4 | 4 | Andorra | Andorre (l') | AD | AND | 20 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 245 | 245 | Yemen | Yémen (le) | YE | YEM | 887 |
-| 246 | 246 | Zambia | Zambie (la) | ZM | ZMB | 894 |
-| 247 | 247 | Zimbabwe | Zimbabwe (le) | ZW | ZWE | 716 |
-| 248 | 248 | Åland Islands | Åland(les Îles) | AX | ALA | 248 |
-| 249 | 249 | International Agency | International Agency | OI | INT | 999 |
-
-250 rows × 6 columns
-
-```python
-country['id'] = country['index'] + 1
-del country['index']
-del country['French short name']
-del country['Alpha-3 code']
-del country['Numeric']
-country
-```
-
-|     | English short name | Alpha-2 code | id |
-| --- | --- | --- | --- |
-| 0 | Afghanistan | AF | 1 |
-| 1 | Albania | AL | 2 |
-| 2 | Algeria | DZ | 3 |
-| 3 | American Samoa | AS | 4 |
-| 4 | Andorra | AD | 5 |
-| ... | ... | ... | ... |
-| 245 | Yemen | YE | 246 |
-| 246 | Zambia | ZM | 247 |
-| 247 | Zimbabwe | ZW | 248 |
-| 248 | Åland Islands | AX | 249 |
-| 249 | International Agency | OI | 250 |
-
-250 rows × 3 columns
-
-```python
-# rename the columns
-country = country.rename(columns={'Alpha-2 code' : 'iso_code', 'English short name' : 'name'})
-```
-
-
-```python
-# add the UNKNOWN value
-country = country.append({'id' : 999999, 'iso_code' : '__', 'name' : 'UNKNOWN'}, ignore_index=True)
-```
-
-
-```python
-country
-```
-
-|     | name | iso_code | id |
-| --- | --- | --- | --- |
-| 0 | Afghanistan | AF | 1 |
-| 1 | Albania | AL | 2 |
-| 2 | Algeria | DZ | 3 |
-| 3 | American Samoa | AS | 4 |
-| 4 | Andorra | AD | 5 |
-| ... | ... | ... | ... |
-| 246 | Zambia | ZM | 247 |
-| 247 | Zimbabwe | ZW | 248 |
-| 248 | Åland Islands | AX | 249 |
-| 249 | International Agency | OI | 250 |
-| 250 | UNKNOWN | __ | 999999 |
-
-251 rows × 3 columns
-
-```python
-# export JSON
-result = country.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/country.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export csv
-country.to_csv('sample/country.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export csv
-country.to_csv('country.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-country.to_excel('sample/country.xlsx', index=False)
-```
diff --git a/import_scripts/01_oacct_countries.py b/import_scripts/01_oacct_countries.py
deleted file mode 100644
index 5f4ff631..00000000
--- a/import_scripts/01_oacct_countries.py
+++ /dev/null
@@ -1,107 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) Project
-#
-# P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts selected data from the sources obtained via API and processes it for use in the OACCT application.
-#
-# Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-
-
-# ## Table Countries
-
-# In[2]:
-
-
-# The table was corrected to add the missing value at the end:
-# International Agency International Agency OI INT 999
-country = pd.read_csv('iso_3166.txt', encoding='utf-8', header=0, sep='\t', na_filter=False)
-country
-
-
-# In[3]:
-
-
-country.loc[country['Alpha-2 code'].isnull()]
-
-
-# In[4]:
-
-
-# convert the index to an id
-country = country.reset_index()
-country
-
-
-# In[5]:
-
-
-country['id'] = country['index'] + 1
-del country['index']
-del country['French short name']
-del country['Alpha-3 code']
-del country['Numeric']
-country
-
-
-# In[6]:
-
-
-# rename the columns
-country = country.rename(columns={'Alpha-2 code' : 'iso_code', 'English short name' : 'name'})
-
-
-# In[7]:
-
-
-# add the UNKNOWN value
-country = country.append({'id' : 999999, 'iso_code' : '__', 'name' : 'UNKNOWN'}, ignore_index=True)
-
-
-# In[8]:
-
-
-country
-
-
-# In[9]:
-
-
-# export JSON
-result = country.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/country.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[10]:
-
-
-# export csv
-country.to_csv('sample/country.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[11]:
-
-
-# export csv
-country.to_csv('country.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[12]:
-
-
-# export excel
-country.to_excel('sample/country.xlsx', index=False)
-
diff --git a/import_scripts/02_oacct_languages.md b/import_scripts/02_oacct_languages.md
deleted file mode 100644
index efcffbdd..00000000
--- a/import_scripts/02_oacct_languages.md
+++ /dev/null
@@ -1,694 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) Project
-
-P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts selected data from the sources obtained via API and processes it for use in the OACCT application.
-
-Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-```
-
-## Table Language
-
-
-```python
-# https://www.loc.gov/standards/iso639-2/php/code_list.php
-# ISO 639-2 Code ISO 639-1 Code English name of Language French name of Language German name of Language
-language = pd.read_csv('ISO-639-2_utf-8.txt', encoding='utf-8', header=None, sep='|', na_filter=False, names=['ISO 639-2 Code', 'ISO 639-1 Code', 'ignore', 'English name of Language', 'French name of Language'], index_col=False)
-language
-```
-
-|     | ISO 639-2 Code | ISO 639-1 Code | ignore | English name of Language | French name of Language |
-| --- | --- | --- | --- | --- | --- |
-| 0 | aar |  | aa | Afar | afar |
-| 1 | abk |  | ab | Abkhazian | abkhaze |
-| 2 | ace |  |  | Achinese | aceh |
-| 3 | ach |  |  | Acoli | acoli |
-| 4 | ada |  |  | Adangme | adangme |
-| ... | ... | ... | ... | ... | ... |
-| 482 | znd |  |  | Zande languages | zandé, langues |
-| 483 | zul |  | zu | Zulu | zoulou |
-| 484 | zun |  |  | Zuni | zuni |
-| 485 | zxx |  |  | No linguistic content; Not applicable | pas de contenu linguistique; non applicable |
-| 486 | zza |  |  | Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki | zaza; dimili; dimli; kirdki; kirmanjki; zazaki |
-
-487 rows × 5 columns
-
-```python
-language.loc[language['ISO 639-2 Code'].isnull()]
-```
-
-|     | ISO 639-2 Code | ISO 639-1 Code | ignore | English name of Language | French name of Language |
-| --- | --- | --- | --- | --- | --- |
-
-```python
-# convert the index to an id
-language = language.reset_index()
-language
-```
-
-|     | index | ISO 639-2 Code | ISO 639-1 Code | ignore | English name of Language | French name of Language |
-| --- | --- | --- | --- | --- | --- | --- |
-| 0 | 0 | aar |  | aa | Afar | afar |
-| 1 | 1 | abk |  | ab | Abkhazian | abkhaze |
-| 2 | 2 | ace |  |  | Achinese | aceh |
-| 3 | 3 | ach |  |  | Acoli | acoli |
-| 4 | 4 | ada |  |  | Adangme | adangme |
-| ... | ... | ... | ... | ... | ... | ... |
-| 482 | 482 | znd |  |  | Zande languages | zandé, langues |
-| 483 | 483 | zul |  | zu | Zulu | zoulou |
-| 484 | 484 | zun |  |  | Zuni | zuni |
-| 485 | 485 | zxx |  |  | No linguistic content; Not applicable | pas de contenu linguistique; non applicable |
-| 486 | 486 | zza |  |  | Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki | zaza; dimili; dimli; kirdki; kirmanjki; zazaki |
-
-487 rows × 6 columns
-
-```python
-language['id'] = language['index'] + 1
-del language['index']
-del language['ignore']
-del language['French name of Language']
-del language['ISO 639-1 Code']
-language
-```
-
-|     | ISO 639-2 Code | English name of Language | id |
-| --- | --- | --- | --- |
-| 0 | aar | Afar | 1 |
-| 1 | abk | Abkhazian | 2 |
-| 2 | ace | Achinese | 3 |
-| 3 | ach | Acoli | 4 |
-| 4 | ada | Adangme | 5 |
-| ... | ... | ... | ... |
-| 482 | znd | Zande languages | 483 |
-| 483 | zul | Zulu | 484 |
-| 484 | zun | Zuni | 485 |
-| 485 | zxx | No linguistic content; Not applicable | 486 |
-| 486 | zza | Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki | 487 |
-
-487 rows × 3 columns
-
-```python
-# rename the columns
-language = language.rename(columns={'ISO 639-2 Code' : 'iso_code', 'English name of Language' : 'name'})
-```
-
-
-```python
-language
-```
-
-|     | iso_code | name | id |
-| --- | --- | --- | --- |
-| 0 | aar | Afar | 1 |
-| 1 | abk | Abkhazian | 2 |
-| 2 | ace | Achinese | 3 |
-| 3 | ach | Acoli | 4 |
-| 4 | ada | Adangme | 5 |
-| ... | ... | ... | ... |
-| 482 | znd | Zande languages | 483 |
-| 483 | zul | Zulu | 484 |
-| 484 | zun | Zuni | 485 |
-| 485 | zxx | No linguistic content; Not applicable | 486 |
-| 486 | zza | Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki | 487 |
-
-487 rows × 3 columns
-
-```python
-# fix the overly long value qaa-qtz
-language.loc[language['iso_code'] == 'qaa-qtz', 'iso_code'] = 'qaa'
-```
-
-
-```python
-# add the UNKNOWN value
-language = language.append({'id' : 999999, 'iso_code' : '___', 'name' : 'UNKNOWN'}, ignore_index=True)
-language
-```
-
-|     | iso_code | name | id |
-| --- | --- | --- | --- |
-| 0 | aar | Afar | 1 |
-| 1 | abk | Abkhazian | 2 |
-| 2 | ace | Achinese | 3 |
-| 3 | ach | Acoli | 4 |
-| 4 | ada | Adangme | 5 |
-| ... | ... | ... | ... |
-| 483 | zul | Zulu | 484 |
-| 484 | zun | Zuni | 485 |
-| 485 | zxx | No linguistic content; Not applicable | 486 |
-| 486 | zza | Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki | 487 |
-| 487 | ___ | UNKNOWN | 999999 |
-
-488 rows × 3 columns
-
-```python
-# export JSON
-result = language.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/language.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export csv
-language.to_csv('language.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export csv
-language.to_csv('sample/language.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-language.to_excel('sample/language.xlsx', index=False)
-```
diff --git a/import_scripts/02_oacct_languages.py b/import_scripts/02_oacct_languages.py
deleted file mode 100644
index 7f859fdc..00000000
--- a/import_scripts/02_oacct_languages.py
+++ /dev/null
@@ -1,115 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) Project
-#
-# P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts selected data from the sources obtained via API and processes it for use in the OACCT application.
-#
-# Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-
-
-# ## Table Language
-
-# In[2]:
-
-
-# https://www.loc.gov/standards/iso639-2/php/code_list.php
-# ISO 639-2 Code ISO 639-1 Code English name of Language French name of Language German name of Language
-language = pd.read_csv('ISO-639-2_utf-8.txt', encoding='utf-8', header=None, sep='|', na_filter=False, names=['ISO 639-2 Code', 'ISO 639-1 Code', 'ignore', 'English name of Language', 'French name of Language'], index_col=False)
-language
-
-
-# In[3]:
-
-
-language.loc[language['ISO 639-2 Code'].isnull()]
-
-
-# In[4]:
-
-
-# convert the index to an id
-language = language.reset_index()
-language
-
-
-# In[5]:
-
-
-language['id'] = language['index'] + 1
-del language['index']
-del language['ignore']
-del language['French name of Language']
-del language['ISO 639-1 Code']
-language
-
-
-# In[6]:
-
-
-# rename the columns
-language = language.rename(columns={'ISO 639-2 Code' : 'iso_code', 'English name of Language' : 'name'})
-
-
-# In[7]:
-
-
-language
-
-
-# In[8]:
-
-
-# fix the overly long value qaa-qtz
-language.loc[language['iso_code'] == 'qaa-qtz', 'iso_code'] = 'qaa'
-
-
-# In[9]:
-
-
-# add the UNKNOWN value
-language = language.append({'id' : 999999, 'iso_code' : '___', 'name' : 'UNKNOWN'}, ignore_index=True)
-language
-
-
-# In[10]:
-
-
-# export JSON
-result = language.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/language.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[11]:
-
-
-# export csv
-language.to_csv('language.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[12]:
-
-
-# export csv
-language.to_csv('sample/language.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[13]:
-
-
-# export excel
-language.to_excel('sample/language.xlsx', index=False)
-
diff --git a/import_scripts/03_oacct_journals.md b/import_scripts/03_oacct_journals.md
deleted file mode 100644
index b47e42df..00000000
--- a/import_scripts/03_oacct_journals.md
+++ /dev/null
@@ -1,17070 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) Project
-
-P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts selected data from the sources obtained via API and processes it for use in the OACCT application.
-
-Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-## Extracting journal data
-
-
-## Initial corpus
-
-ISSNs of the journals of publications archived in AoU (UNIGE) and Infoscience (EPFL)
-
-* AoU ISSN file exported on 16.10.2020
-* Infoscience ISSN file exported on 28.01.2021
-* Data extracted from the ISSN.org JSON
-
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# parameter for the number of journals in the sample (0 to take all)
-journals_sample_n = 1000
-```
-
-## Table OA categories
-
-* 1 : UNKNOWN
-* 2 : Green
-* 3 : Hybrid
-* 4 : Full
-* 5 : Gold
-* 6 : Diamond
-
-
-```python
-# create the DataFrame
-col_names = ['id',
- 'status',
- 'description',
- 'subscription',
- 'accepted_manuscript',
- 'apc',
- 'final_version'
- ]
-oas = pd.DataFrame(columns = col_names)
-oas
-```
-
-|     | id | status | description | subscription | accepted_manuscript | apc | final_version |
-| --- | --- | --- | --- | --- | --- | --- | --- |
-
-```python
-# add the values
-oas = oas.append({'id' : 1, 'status' : 'UNKNOWN', 'description' : '', 'subscription' : 0, 'accepted_manuscript' : 0, 'apc' : 0, 'final_version' : 0}, ignore_index=True)
-oas = oas.append({'id' : 2, 'status' : 'Green', 'description' : 'Paywalled access journal, usually allows the archiving of the submitted or accepted version in institutional repositories (embargo periods may apply)', 'subscription' : 1, 'accepted_manuscript' : 1, 'apc' : 0, 'final_version' : 0}, ignore_index=True)
-oas = oas.append({'id' : 3, 'status' : 'hybrid', 'description' : 'Paywalled access journal, offers several Open Access options upon payment of APCs. It often allows the archiving of the published version in institutional repositories (embargo periods may apply)', 'subscription' : 1, 'accepted_manuscript' : 1, 'apc' : 1, 'final_version' : 1}, ignore_index=True)
-# oas = oas.append({'id' : 4, 'status' : 'Full', 'description' : 'No subscription, Green or Gold', 'subscription' : 0, 'accepted_manuscript' : 1, 'apc' : 0, 'final_version' : 1}, ignore_index=True)
-oas = oas.append({'id' : 5, 'status' : 'Gold', 'description' : 'Open Access journal (payment of APCs may apply). It often allows the archiving of the published version in institutional repositories (embargo periods may apply)', 'subscription' : 0, 'accepted_manuscript' : 1, 'apc' : 1, 'final_version' : 1}, ignore_index=True)
-oas = oas.append({'id' : 6, 'status' : 'Diamond', 'description' : 'Open Access journal (without payment of APCs). It often allows the archiving of the published version in institutional repositories (embargo periods may apply)', 'subscription' : 0, 'accepted_manuscript' : 1, 'apc' : 0, 'final_version' : 1}, ignore_index=True)
-```
-
-
-```python
-oas
-```
-
-|     | id | status | description | subscription | accepted_manuscript | apc | final_version |
-| --- | --- | --- | --- | --- | --- | --- | --- |
-| 0 | 1 | UNKNOWN |  | 0 | 0 | 0 | 0 |
-| 1 | 2 | Green | Paywalled access journal, usually allows the a... | 1 | 1 | 0 | 0 |
-| 2 | 3 | hybrid | Paywalled access journal, offers several Open ... | 1 | 1 | 1 | 1 |
-| 3 | 5 | Gold | Open Access journal (payment of APCs may apply... | 0 | 1 | 1 | 1 |
-| 4 | 6 | Diamond | Open Access journal (without payment of APCs).... | 0 | 1 | 0 | 1 |
-
-```python
-# export JSON
-result = oas.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/oa.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export csv
-oas.to_csv('sample/oa.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-oas.to_excel('sample/oa.xlsx', index=False)
-```
-
-## Table Journals
-
-
-```python
-issns = pd.read_csv('issn/issns_count.tsv', encoding='utf-8', header=0, sep='\t')
-issns
-```
-
-| | issn | count_unige | count_epfl | count |
-|---|---|---|---|---|
-| 0 | 1660-9379 | 1654.0 | 2.0 | 1656.0 |
-| 1 | 0031-9007 | 602.0 | 678.0 | 1280.0 |
-| 2 | 1932-6203 | 608.0 | 340.0 | 948.0 |
-| 3 | 2174-8454 | 732.0 | 0.0 | 732.0 |
-| 4 | 1098-0121 | 334.0 | 393.0 | 727.0 |
-| ... | ... | ... | ... | ... |
-| 13593 | 1471-0153 | 1.0 | 0.0 | 1.0 |
-| 13594 | 2257-5294 | 1.0 | 0.0 | 1.0 |
-| 13595 | 0950-9240 | 1.0 | 0.0 | 1.0 |
-| 13596 | 1868-1883 | 1.0 | 0.0 | 1.0 |
-| 13597 | 1063-6889 | 0.0 | 1.0 | 1.0 |
-
-13598 rows × 4 columns
-
-```python
-# add the columns
-issns.insert(0, 'id', '', False)
-issns
-```
-
-| | id | issn | count_unige | count_epfl | count |
-|---|---|---|---|---|---|
-| 0 | | 1660-9379 | 1654.0 | 2.0 | 1656.0 |
-| 1 | | 0031-9007 | 602.0 | 678.0 | 1280.0 |
-| 2 | | 1932-6203 | 608.0 | 340.0 | 948.0 |
-| 3 | | 2174-8454 | 732.0 | 0.0 | 732.0 |
-| 4 | | 1098-0121 | 334.0 | 393.0 | 727.0 |
-| ... | ... | ... | ... | ... | ... |
-| 13593 | | 1471-0153 | 1.0 | 0.0 | 1.0 |
-| 13594 | | 2257-5294 | 1.0 | 0.0 | 1.0 |
-| 13595 | | 0950-9240 | 1.0 | 0.0 | 1.0 |
-| 13596 | | 1868-1883 | 1.0 | 0.0 | 1.0 |
-| 13597 | | 1063-6889 | 0.0 | 1.0 | 1.0 |
-
-13598 rows × 5 columns
-
-```python
-# convert the index into an id
-issns = issns.reset_index()
-issns
-```
-
-| | index | id | issn | count_unige | count_epfl | count |
-|---|---|---|---|---|---|---|
-| 0 | 0 | | 1660-9379 | 1654.0 | 2.0 | 1656.0 |
-| 1 | 1 | | 0031-9007 | 602.0 | 678.0 | 1280.0 |
-| 2 | 2 | | 1932-6203 | 608.0 | 340.0 | 948.0 |
-| 3 | 3 | | 2174-8454 | 732.0 | 0.0 | 732.0 |
-| 4 | 4 | | 1098-0121 | 334.0 | 393.0 | 727.0 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 13593 | 13593 | | 1471-0153 | 1.0 | 0.0 | 1.0 |
-| 13594 | 13594 | | 2257-5294 | 1.0 | 0.0 | 1.0 |
-| 13595 | 13595 | | 0950-9240 | 1.0 | 0.0 | 1.0 |
-| 13596 | 13596 | | 1868-1883 | 1.0 | 0.0 | 1.0 |
-| 13597 | 13597 | | 1063-6889 | 0.0 | 1.0 | 1.0 |
-
-13598 rows × 6 columns
-
-```python
-# set the id to index + 1
-issns['id'] = issns['index'] + 1
-del issns['index']
-issns
-```
-
-| | id | issn | count_unige | count_epfl | count |
-|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1654.0 | 2.0 | 1656.0 |
-| 1 | 2 | 0031-9007 | 602.0 | 678.0 | 1280.0 |
-| 2 | 3 | 1932-6203 | 608.0 | 340.0 | 948.0 |
-| 3 | 4 | 2174-8454 | 732.0 | 0.0 | 732.0 |
-| 4 | 5 | 1098-0121 | 334.0 | 393.0 | 727.0 |
-| ... | ... | ... | ... | ... | ... |
-| 13593 | 13594 | 1471-0153 | 1.0 | 0.0 | 1.0 |
-| 13594 | 13595 | 2257-5294 | 1.0 | 0.0 | 1.0 |
-| 13595 | 13596 | 0950-9240 | 1.0 | 0.0 | 1.0 |
-| 13596 | 13597 | 1868-1883 | 1.0 | 0.0 | 1.0 |
-| 13597 | 13598 | 1063-6889 | 0.0 | 1.0 | 1.0 |
-
-13598 rows × 5 columns
-
-
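The three cells above (insert an empty `id` column, reset the index, then assign `index + 1`) produce a 1-based identifier. A minimal sketch of the same result in one step, assuming the frame still has its default integer index; `demo` is a toy stand-in for `issns`, not the real data.

```python
import pandas as pd

# Toy frame standing in for `issns` (ISSNs taken from the first rows above).
demo = pd.DataFrame({'issn': ['1660-9379', '0031-9007', '1932-6203']})

# Insert a 1-based id as the first column in a single step.
demo.insert(0, 'id', range(1, len(demo) + 1))
print(demo)
```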
-```python
-# limit to X journals for the test sample
-if journals_sample_n > 0 :
-    issns = issns.loc[:journals_sample_n]
-issns
-```
-
-| | id | issn | count_unige | count_epfl | count |
-|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1654.0 | 2.0 | 1656.0 |
-| 1 | 2 | 0031-9007 | 602.0 | 678.0 | 1280.0 |
-| 2 | 3 | 1932-6203 | 608.0 | 340.0 | 948.0 |
-| 3 | 4 | 2174-8454 | 732.0 | 0.0 | 732.0 |
-| 4 | 5 | 1098-0121 | 334.0 | 393.0 | 727.0 |
-| ... | ... | ... | ... | ... | ... |
-| 996 | 997 | 0964-1726 | 1.0 | 20.0 | 21.0 |
-| 997 | 998 | 0022-3468 | 21.0 | 0.0 | 21.0 |
-| 998 | 999 | 1432-2064 | 17.0 | 4.0 | 21.0 |
-| 999 | 1000 | 0960-1481 | 5.0 | 16.0 | 21.0 |
-| 1000 | 1001 | 0161-7567 | 21.0 | 0.0 | 21.0 |
-
-1001 rows × 5 columns
-
-
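The slice above uses `.loc[:journals_sample_n]`, which is label-based and includes the end label. With the default integer index this keeps one row more than the limit, which would explain the 1001 rows shown if `journals_sample_n` is 1000 (an assumption, since its value is not shown in this section). A small sketch contrasting it with a positional cut; `df` and `n` are illustrative.

```python
import pandas as pd

# Toy frame with the default RangeIndex, standing in for `issns`.
df = pd.DataFrame({'issn': ['1660-9379', '0031-9007', '1932-6203',
                            '2174-8454', '1098-0121']})
n = 3  # hypothetical sample size

by_label = df.loc[:n]     # label slice: the end label n is included
by_position = df.head(n)  # positional cut: exactly n rows

print(len(by_label), len(by_position))
```

If exactly `n` journals are wanted, `head(n)` or `iloc[:n]` is the closer match.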
-```python
-# add the ISSN-L values
-df_issnl = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-df_issnl
-```
-
-| | ISSN | ISSN-L |
-|---|---|---|
-| 0 | 0000-0019 | 0000-0019 |
-| 1 | 0000-0027 | 0000-0027 |
-| 2 | 0000-0043 | 0000-0043 |
-| 3 | 0000-0051 | 0000-0051 |
-| 4 | 0000-006X | 0000-006X |
-| ... | ... | ... |
-| 1995913 | 8756-9957 | 8756-9957 |
-| 1995914 | 8756-9965 | 8756-9965 |
-| 1995915 | 8756-9973 | 8756-9973 |
-| 1995916 | 8756-9981 | 8756-9981 |
-| 1995917 | 8756-999X | 8756-999X |
-
-1995918 rows × 2 columns
-
-```python
-# rename the columns
-df_issnl = df_issnl.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-```
-
-
-```python
-issns = pd.merge(issns, df_issnl, on='issn', how='left')
-issns
-```
-
-| | id | issn | count_unige | count_epfl | count | issnl |
-|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1654.0 | 2.0 | 1656.0 | 1660-9379 |
-| 1 | 2 | 0031-9007 | 602.0 | 678.0 | 1280.0 | 0031-9007 |
-| 2 | 3 | 1932-6203 | 608.0 | 340.0 | 948.0 | 1932-6203 |
-| 3 | 4 | 2174-8454 | 732.0 | 0.0 | 732.0 | 2174-8454 |
-| 4 | 5 | 1098-0121 | 334.0 | 393.0 | 727.0 | 1098-0121 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 996 | 997 | 0964-1726 | 1.0 | 20.0 | 21.0 | 0964-1726 |
-| 997 | 998 | 0022-3468 | 21.0 | 0.0 | 21.0 | 0022-3468 |
-| 998 | 999 | 1432-2064 | 17.0 | 4.0 | 21.0 | 0178-8051 |
-| 999 | 1000 | 0960-1481 | 5.0 | 16.0 | 21.0 | 0960-1481 |
-| 1000 | 1001 | 0161-7567 | 21.0 | 0.0 | 21.0 | 0161-7567 |
-
-1001 rows × 6 columns
-
-
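The left merge above keeps every row of `issns`; an ISSN that is missing from the ISSN-to-ISSN-L mapping simply gets `NaN` in `issnl`. A minimal sketch of how such rows could be listed with the `indicator` option of `pd.merge`. The frames `issns_demo` and `issnl_demo` are toy data, and the malformed ISSN `0370-693` is assumed to be absent from the mapping.

```python
import pandas as pd

# Toy stand-ins for `issns` and `df_issnl`.
issns_demo = pd.DataFrame({'id': [1, 2, 3],
                           'issn': ['1660-9379', '1432-2064', '0370-693']})
issnl_demo = pd.DataFrame({'issn': ['1660-9379', '1432-2064'],
                           'issnl': ['1660-9379', '0178-8051']})

# indicator=True adds a `_merge` column telling where each row was found.
merged = pd.merge(issns_demo, issnl_demo, on='issn', how='left', indicator=True)

# Rows present only on the left side had no ISSN-L match.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'issn'].tolist()
print(unmatched)
```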
-```python
-# create the DataFrame
-# 'oa_status' removed for now
-col_names = ['id',
-    'issn',
-    'issnl',
-    'title',
-    'starting_year',
-    'end_year',
-    'url',
-    'name_short_iso_4'
-    ]
-journals = pd.DataFrame(columns = col_names)
-journals
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 |
-|---|---|---|---|---|---|---|---|---|
-
-```python
-# create the DataFrame
-col_names = ['id', 'iso_code']
-journals_languages = pd.DataFrame(columns = col_names)
-journals_languages
-```
-
-| | id | iso_code |
-|---|---|---|
-
-```python
-# create the DataFrame
-# 'oa_status' removed
-col_names = ['id', 'iso_code']
-journals_countries = pd.DataFrame(columns = col_names)
-journals_countries
-```
-
-| | id | iso_code |
-|---|---|---|
-
-```python
-# extract the information from the ISSN.org data
-for index, row in issns.iterrows():
-    myid = row['id']
-    myissn = row['issn']
-    # print progress every 10 rows
-    if index % 10 == 0:
-        print(index)
-    # initialise the variables to extract
-    issnl = np.nan
-    title = ''
-    keytitle = ''
-    starting_year = np.nan
-    end_year = np.nan
-    myurl = np.nan
-    journal_country = np.nan
-    journal_language = np.nan
-    keytitle_abbr = np.nan
-    # read the JSON export
-    if os.path.exists('issn/data/' + myissn + '.json'):
-        with open('issn/data/' + myissn + '.json', 'r', encoding='utf-8') as f:
-            data = json.load(f)
-        for x in data['@graph']:
-            if ('@id' in x):
-                if (x['@id'] == 'resource/ISSN/' + myissn):
-                    if ('mainTitle' in x):
-                        title = x['mainTitle']
-                    else :
-                        if ('name' in x):
-                            title = x['name']
-                            # print(myissn)
-                    if ('startDate' in x):
-                        starting_year = x['startDate']
-                    if ('endDate' in x):
-                        end_year = x['endDate']
-                    if ('url' in x):
-                        urls = x['url']
-                        if type(urls) is list:
-                            for url in urls:
-                                # Filter out the archive URLs:
-                                # www.ncbi.nlm.nih.gov/pmc/*
-                                # www.pubmedcentral.gov/*
-                                # pubmedcentral.nih.gov/*
-                                # bibpurl.oclc.org/*
-                                # www.jstor.org/*
-                                # ieeexplore.ieee.org
-                                # ovidsp.ovid.com
-                                # and keep the first of the remaining ones
-                                myurl = url
-                                if ('ncbi.nlm.nih.gov' not in url
-                                        and 'pubmedcentral' not in url
-                                        and 'bibpurl.oclc.org' not in url
-                                        and 'jstor.org' not in url
-                                        and 'ieeexplore.ieee.org' not in url
-                                        and 'ovidsp.ovid.com' not in url):
-                                    break
-                        else :
-                            myurl = x['url']
-                    if ('spatial' in x):
-                        countries = x['spatial']
-                        if type(countries) is list:
-                            for country in countries:
-                                if ('https://www.iso.org/obp/ui/#iso:code:3166:' in country):
-                                    journal_country = country[-2:]
-                                    journals_countries = journals_countries.append({'id' : myid, 'iso_code' : journal_country}, ignore_index=True)
-                        else :
-                            if ('https://www.iso.org/obp/ui/#iso:code:3166:' in countries):
-                                journal_country = countries[-2:]
-                                journals_countries = journals_countries.append({'id' : myid, 'iso_code' : journal_country}, ignore_index=True)
-                    # language, e.g. "inLanguage": "http://id.loc.gov/vocabulary/iso639-2/eng",
-                    if ('inLanguage' in x):
-                        languages = x['inLanguage']
-                        if type(languages) is list:
-                            for language in languages:
-                                journal_language = language[-3:]
-                                journals_languages = journals_languages.append({'id' : myid, 'iso_code' : journal_language}, ignore_index=True)
-                        else :
-                            journal_language = languages[-3:]
-                            journals_languages = journals_languages.append({'id' : myid, 'iso_code' : journal_language}, ignore_index=True)
-                if (x['@id'] == 'resource/ISSN/' + myissn + '#KeyTitle'):
-                    if ('value' in x):
-                        keytitle = x['value']
-                if (x['@id'] == 'resource/ISSN/' + myissn + '#ISSN-L'):
-                    if ('value' in x):
-                        issnl = x['value']
-                # "@id": "resource/ISSN/1098-0121#AbbreviatedKeyTitle",
-                if (x['@id'] == 'resource/ISSN/' + myissn + '#AbbreviatedKeyTitle'):
-                    if ('value' in x):
-                        mykeytitle_abbrs = x['value']
-                        if type(mykeytitle_abbrs) is list:
-                            for mykeytitle_abbr in mykeytitle_abbrs:
-                                print(myissn + ' - AbbreviatedKeyTitle is a list ' + mykeytitle_abbr)
-                                keytitle_abbr = mykeytitle_abbr
-                                with open('sample/03_journals_issn_multiple_titles.txt', 'a', encoding='utf-8') as g:
-                                    g.write(myissn + ' AbbreviatedKeyTitle is a list ' + mykeytitle_abbr + '\n')
-                                break
-                        else :
-                            keytitle_abbr = mykeytitle_abbrs
-        if keytitle != '' :
-            title = keytitle
-        if title != '' :
-            # remove the trailing period
-            if (title[-1] == '.'):
-                title = title[0:-1]
-            # replace the special characters (The)
-            if type(title) is list:
-                for mytitlei in title:
-                    print(myissn + ' - title is a list ' + mytitlei)
-                    title = str.replace(mytitlei, 'The ', 'The ')
-                    with open('sample/03_journals_issn_multiple_titles.txt', 'a', encoding='utf-8') as g:
-                        g.write(myissn + ' title is a list ' + mytitlei + '\n')
-                    break
-            else :
-                title = str.replace(title, 'The ', 'The ')
-    else :
-        print(row['issn'] + ' - not found')
-        with open('sample/03_journals_issn_errors.txt', 'a', encoding='utf-8') as g:
-            g.write(row['issn'] + ' not found \n')
-    journals.at[index,'id'] = myid
-    journals.at[index,'title'] = title
-    journals.at[index,'issn'] = myissn
-    journals.at[index,'issnl'] = issnl
-    journals.at[index,'starting_year'] = starting_year
-    journals.at[index,'end_year'] = end_year
-    journals.at[index,'url'] = myurl
-    journals.at[index,'name_short_iso_4'] = keytitle_abbr
-```
-
- 0
- 10
- 1094-4087 - AbbreviatedKeyTitle is a list Opt Express
- 20
- 30
- 40
- 50
- 60
- 70
- 80
- 90
- 100
- 110
- 120
- 130
- 140
- 150
- 160
- 170
- 0899-823X - AbbreviatedKeyTitle is a list Infect. control hosp. epidemiol.
- 180
- 190
- 200
- 210
- 220
- 230
- 240
- 250
- 260
- 270
- 280
- 290
- 300
- 0370-693 - not found
- 310
- 320
- 330
- 340
- 350
- 360
- 370
- 380
- 390
- 400
- 410
- 420
- 430
- 440
- 450
- 460
- 470
- 480
- 490
- 500
- 510
- 520
- 530
- 540
- 1544-9173 - AbbreviatedKeyTitle is a list PLoS Biol
- 550
- 560
- 570
- 580
- 590
- 600
- 610
- 620
- 0277-86X - not found
- 630
- 640
- 650
- 0003-951 - not found
- 660
- 670
- 680
- 690
- 700
- 710
- 720
- 730
- 740
- 750
- 760
- 770
- 780
- 790
- 1931-3128 - AbbreviatedKeyTitle is a list Cell Host Microbe
- 800
- 810
- 820
- 830
- 840
- 850
- 860
- 870
- 880
- 890
- 900
- 910
- 920
- 930
- 940
- 950
- 960
- 970
- 980
- 990
- 1000
-
-
-
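The URL selection inside the loop above keeps the first URL that does not point to one of the listed archive hosts and, when every URL is an archive link, ends up with the last one. A sketch of that rule as a standalone function, which makes it easier to test in isolation. The names `ARCHIVE_MARKERS` and `pick_journal_url` are invented for this example, and returning `None` for an empty list is an assumption (the loop above leaves `np.nan` in that case).

```python
# Substrings identifying archive hosts, as listed in the loop above.
ARCHIVE_MARKERS = (
    'ncbi.nlm.nih.gov',
    'pubmedcentral',
    'bibpurl.oclc.org',
    'jstor.org',
    'ieeexplore.ieee.org',
    'ovidsp.ovid.com',
)


def pick_journal_url(urls):
    """Return the first non-archive URL, else the last URL, else None."""
    if isinstance(urls, str):
        # A single URL is taken as is, like the non-list branch of the loop.
        return urls
    chosen = None
    for url in urls:
        chosen = url
        if not any(marker in url for marker in ARCHIVE_MARKERS):
            break
    return chosen


# The JSTOR address below is a made-up example URL.
print(pick_journal_url(['https://www.jstor.org/journal/example',
                        'http://prl.aps.org/']))
```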
-```python
-journals
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 996 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) |
-| 997 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) |
-| 998 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) |
-| 999 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy |
-| 1000 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... |
-
-1001 rows × 8 columns
-
-```python
-# empty titles
-journals.loc[journals['title'] == '']
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 |
-|---|---|---|---|---|---|---|---|---|
-| 309 | 310 | 0370-693 | NaN | | NaN | NaN | NaN | NaN |
-| 361 | 362 | 0777-5466 | NaN | | \|\|\|\| | \|\|\|\| | NaN | NaN |
-| 629 | 630 | 0277-86X | NaN | | NaN | NaN | NaN | NaN |
-| 656 | 657 | 0003-951 | NaN | | NaN | NaN | NaN | NaN |
-| 840 | 841 | 1089-5647 | NaN | | NaN | NaN | NaN | NaN |
-
-
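Three of the five ISSNs without a title (`0370-693`, `0277-86X`, `0003-951`) are one character short, which is why no ISSN.org file was found for them. A minimal sketch of a pre-check that could flag such values before the lookup. It verifies the `NNNN-NNNC` shape and the standard ISSN check digit (modulus 11 with weights 8 down to 2); the function name `is_valid_issn` is hypothetical and not part of the notebook.

```python
import re

# Four digits, a hyphen, three digits, then a digit or X as check character.
ISSN_PATTERN = re.compile(r'\d{4}-\d{3}[\dX]')


def is_valid_issn(issn):
    """Check the NNNN-NNNC shape and the modulus-11 check digit."""
    if not ISSN_PATTERN.fullmatch(issn):
        return False
    digits = issn.replace('-', '')
    # Weights 8, 7, ..., 2 apply to the first seven digits.
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = 'X' if check == 10 else str(check)
    return digits[7] == expected


for issn in ['0031-9007', '0370-693', '0277-86X', '0003-951']:
    print(issn, is_valid_issn(issn))
```

A well-formed ISSN can still be missing from the data, so this check only catches malformed input.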
-```python
-# export the rows with empty titles as TSV
-journals.loc[journals['title'] == ''].to_csv('sample/journals_sans_titre.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export the rows with empty titles to Excel
-journals.loc[journals['title'] == ''].to_excel('sample/journals_sans_titre.xlsx', index=False)
-```
-
-
-```python
-# keep the rows that have a title
-journals = journals.loc[journals['title'] != '']
-journals
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 996 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) |
-| 997 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) |
-| 998 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) |
-| 999 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy |
-| 1000 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... |
-
-996 rows × 8 columns
-
-```python
-journals.shape[0]
-```
-
-
-
-
- 996
-
-
-
-## Languages
-
-
-```python
-journals_languages
-```
-
-| | id | iso_code |
-|---|---|---|
-| 0 | 1 | fre |
-| 1 | 2 | eng |
-| 2 | 3 | eng |
-| 3 | 4 | eng |
-| 4 | 4 | fre |
-| ... | ... | ... |
-| 1117 | 997 | eng |
-| 1118 | 998 | eng |
-| 1119 | 999 | eng |
-| 1120 | 1000 | eng |
-| 1121 | 1001 | eng |
-
-1122 rows × 2 columns
-
-```python
-# load the languages table
-languages = pd.read_csv('sample/language.tsv', encoding='utf-8', header=0, sep='\t')
-languages
-```
-
-| | iso_code | name | id |
-|---|---|---|---|
-| 0 | aar | Afar | 1 |
-| 1 | abk | Abkhazian | 2 |
-| 2 | ace | Achinese | 3 |
-| 3 | ach | Acoli | 4 |
-| 4 | ada | Adangme | 5 |
-| ... | ... | ... | ... |
-| 483 | zul | Zulu | 484 |
-| 484 | zun | Zuni | 485 |
-| 485 | zxx | No linguistic content; Not applicable | 486 |
-| 486 | zza | Zaza; Dimili; Dimli; Kirdki; Kirmanjki; Zazaki | 487 |
-| 487 | ___ | UNKNOWN | 999999 |
-
-488 rows × 3 columns
-
-```python
-# drop 'name' and rename 'id' to 'language'
-del languages['name']
-languages = languages.rename(columns={'id' : 'language'})
-```
-
-
-```python
-# merge with languages
-journals_languages = pd.merge(journals_languages, languages, on='iso_code', how='left')
-journals_languages
-```
-
-| | id | iso_code | language |
-|---|---|---|---|
-| 0 | 1 | fre | 138 |
-| 1 | 2 | eng | 124 |
-| 2 | 3 | eng | 124 |
-| 3 | 4 | eng | 124 |
-| 4 | 4 | fre | 138 |
-| ... | ... | ... | ... |
-| 1117 | 997 | eng | 124 |
-| 1118 | 998 | eng | 124 |
-| 1119 | 999 | eng | 124 |
-| 1120 | 1000 | eng | 124 |
-| 1121 | 1001 | eng | 124 |
-
-1122 rows × 3 columns
-
-```python
-# concatenate the values that share an id
-journals_languages['language'] = journals_languages['language'].astype(str)
-journals_languages = journals_languages.groupby('id').agg({'language': lambda x: ', '.join(x)})
-journals_languages
-```
-
-| id | language |
-|---|---|
-| 1 | 138 |
-| 2 | 124 |
-| 3 | 124 |
-| 4 | 124, 138, 402, 292 |
-| 5 | 124 |
-| ... | ... |
-| 997 | 124 |
-| 998 | 124 |
-| 999 | 124 |
-| 1000 | 124 |
-| 1001 | 124 |
-
-996 rows × 1 columns
-
-
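The aggregation above collapses several language rows per journal into one comma-separated string. A small sketch of the same step on toy data, using the language ids shown for journals 3 and 4; `demo` and `joined` are illustrative names. Note that `astype(str)` would turn a missing id into the literal string `'nan'`, so unmatched ISO codes are worth checking before this step.

```python
import pandas as pd

# Toy rows: journal 3 has one language, journal 4 has four.
demo = pd.DataFrame({'id': [3, 4, 4, 4, 4],
                     'language': [124, 124, 138, 402, 292]})

# Join needs strings, so cast the numeric ids first.
demo['language'] = demo['language'].astype(str)

# One row per journal id, languages joined in their original order.
joined = demo.groupby('id')['language'].agg(', '.join)
print(joined.to_dict())
```

This variant returns a Series indexed by `id`, whereas the cell above returns a one-column DataFrame.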
-```python
-# attach the language ids
-journals = pd.merge(journals, journals_languages, on='id', how='left')
-journals
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language |
-|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 991 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 |
-| 992 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 |
-| 993 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 |
-| 994 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 |
-| 995 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 |
-
-996 rows × 9 columns
-
-
-## Countries
-
-
-```python
-journals_countries
-```
-
-| | id | iso_code |
-|---|---|---|
-| 0 | 1 | CH |
-| 1 | 2 | US |
-| 2 | 3 | US |
-| 3 | 4 | ES |
-| 4 | 5 | US |
-| ... | ... | ... |
-| 992 | 997 | GB |
-| 993 | 998 | US |
-| 994 | 999 | DE |
-| 995 | 1000 | GB |
-| 996 | 1001 | US |
-
-997 rows × 2 columns
-
-```python
-# load the countries table
-country = pd.read_csv('sample/country.tsv', encoding='utf-8', header=0, sep='\t')
-country
-```
-
-| | name | iso_code | id |
-|---|---|---|---|
-| 0 | Afghanistan | AF | 1 |
-| 1 | Albania | AL | 2 |
-| 2 | Algeria | DZ | 3 |
-| 3 | American Samoa | AS | 4 |
-| 4 | Andorra | AD | 5 |
-| ... | ... | ... | ... |
-| 246 | Zambia | ZM | 247 |
-| 247 | Zimbabwe | ZW | 248 |
-| 248 | Åland Islands | AX | 249 |
-| 249 | International Agency | OI | 250 |
-| 250 | UNKNOWN | __ | 999999 |
-
-251 rows × 3 columns
-
-```python
-# drop 'name' and rename 'id' to 'country'
-del country['name']
-country = country.rename(columns={'id' : 'country'})
-```
-
-
-```python
-# merge with countries
-journals_countries = pd.merge(journals_countries, country, on='iso_code', how='left')
-journals_countries
-```
-
-| | id | iso_code | country |
-|---|---|---|---|
-| 0 | 1 | CH | 215 |
-| 1 | 2 | US | 236 |
-| 2 | 3 | US | 236 |
-| 3 | 4 | ES | 209 |
-| 4 | 5 | US | 236 |
-| ... | ... | ... | ... |
-| 992 | 997 | GB | 234 |
-| 993 | 998 | US | 236 |
-| 994 | 999 | DE | 83 |
-| 995 | 1000 | GB | 234 |
-| 996 | 1001 | US | 236 |
-
-997 rows × 3 columns
-
-```python
-# concatenate the values that share an id
-journals_countries['country'] = journals_countries['country'].astype(str)
-journals_countries = journals_countries.groupby('id').agg({'country': lambda x: ', '.join(x)})
-journals_countries
-```
-
-| id | country |
-|---|---|
-| 1 | 215 |
-| 2 | 236 |
-| 3 | 236 |
-| 4 | 209 |
-| 5 | 236 |
-| ... | ... |
-| 997 | 234 |
-| 998 | 236 |
-| 999 | 83 |
-| 1000 | 234 |
-| 1001 | 236 |
-
-997 rows × 1 columns
-
-```python
-# attach the country ids
-journals = pd.merge(journals, journals_countries, on='id', how='left')
-journals
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 991 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 |
-| 992 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 |
-| 993 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 |
-| 994 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 |
-| 995 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 |
-
-996 rows × 10 columns
-
-
-### DOAJ
-
-
-```python
-# add the DOAJ information
-doaj = pd.read_csv('doaj/journalcsv__doaj_20210312_0636_utf8.csv', encoding='utf-8', header=0)
-doaj
-```
-
-| | Journal title | Journal URL | URL in DOAJ | Alternative title | Journal ISSN (print version) | Journal EISSN (online version) | Keywords | Languages in which the journal accepts manuscripts | Publisher | Country of publisher | ... | URL for journal's Open Access statement | Continues | Continued By | LCC Codes | Subjects | DOAJ Seal | Added on Date | Last updated Date | Number of Article Records | Most Recent Article Added |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | Anais da Academia Brasileira de Ciências | http://www.scielo.br/scielo.php?script=sci_ser... | https://doaj.org/toc/ed09859a464f4461b1af34279... | Annals of the Brazilian Academy of Sciences | 0001-3765 | 1678-2690 | biological sciences, exact and earth sciences,... | English | Academia Brasileira de Ciências | Brazil | ... | http://www.scielo.br/revistas/aabc/isubscrp.htm | NaN | NaN | Q | Science | No | 2004-04-23T21:31:00Z | 2017-01-04T14:19:54Z | 2649 | 2020-06-10T21:49:11Z |
-| 1 | ACME | http://riviste.unimi.it/index.php/ACME | https://doaj.org/toc/b1ca04ba56194f29a362b3eef... | NaN | 0001-494X | 2282-0035 | italian literature, classic literature, lingui... | Italian | Università degli Studi di Milano | Italy | ... | http://riviste.unimi.it/index.php/ACME/about/e... | NaN | NaN | A | General Works | No | 2014-12-22T19:55:58Z | 2020-02-24T09:07:42Z | 166 | 2020-06-19T09:42:34Z |
-| 2 | Acta Dermato-Venereologica | http://www.medicaljournals.se/acta | https://doaj.org/toc/ffde9666ab1d46f1a8c688ce6... | NaN | 0001-5555 | 1651-2057 | sexually transmitted infections, psoriasis, ps... | English | Society for Publication of Acta Dermato-Venere... | Sweden | ... | https://www.medicaljournals.se/acta/open-acces... | NaN | NaN | RL1-803 | Medicine: Dermatology | No | 2011-11-10T12:31:05Z | 2017-02-22T11:14:48Z | 1096 | 2021-03-11T13:41:33Z |
-| 3 | Acta Médica Costarricense | http://actamedica.medicos.cr/index.php/Acta_Me... | https://doaj.org/toc/a5919aee5ad2413a89cf32df0... | NaN | 0001-6012 | 2215-5856 | medicine, public health, medical sciences, health | English, Spanish | Colegio de Médicos y Cirujanos de Costa Rica | Costa Rica | ... | http://actamedica.medicos.cr/index.php/Acta_Me... | NaN | NaN | R | Medicine | No | 2020-12-22T11:08:24Z | 2020-12-22T11:08:24Z | 1207 | 2015-12-08T15:06:43Z |
-| 4 | Acta Mycologica | https://pbsociety.org.pl/journals/index.php/am... | https://doaj.org/toc/0e8e2531ae3f455ebb49acb08... | NaN | 0001-625X | 2353-074X | mycology, micromycetes, marcomycetes, slime mo... | English | Polish Botanical Society | Poland | ... | https://pbsociety.org.pl/journals/index.php/am... | NaN | NaN | QH301-705.5 | Science: Biology (General) | No | 2014-05-29T20:02:32Z | 2021-01-16T17:41:32Z | 1154 | 2021-03-05T18:55:46Z |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | BME Frontiers | https://spj.sciencemag.org/bmef | https://doaj.org/toc/f9fa881c1be5443a86ed71c2e... | Biomedical Engineering Frontiers | NaN | 2765-8031 | biomedical imaging, biomedical devices, biomat... | English | American Association for the Advancement of Sc... | United States | ... | https://spj.sciencemag.org/bmef/about/ | NaN | NaN | R855-855.5\|TP248.13-248.65 | Medicine: Medicine (General): Medical technolo... | No | 2021-01-22T11:54:20Z | 2021-01-22T11:54:20Z | 11 | 2021-03-08T09:06:36Z |
-| 16025 | Harvard Kennedy School Misinformation Review | https://misinforeview.hks.harvard.edu | https://doaj.org/toc/d71096ec7090499681cc0ccf8... | HKS Misinformation Review | NaN | 2766-1652 | misinformation, disinformation, fake news | English | Harvard Kennedy School | United States | ... | https://misinforeview.hks.harvard.edu/editoria... | NaN | NaN | T58.5-58.64\|P87-96 | Technology: Technology (General): Industrial e... | No | 2021-02-12T10:29:21Z | 2021-02-12T10:29:21Z | 0 | NaN |
-| 16026 | One Health & Risk Management | https://journal.ohrm.bba.md/index.php/journal-... | https://doaj.org/toc/68671b966cd24a0ebaa44d78f... | OH&RM | 2887-3458 | 2587-3466 | one health, risc management, public health, hu... | English, Romanian, French, Russian | Asociatia de Biosiguranta si Biosecuritate | Moldova, Republic of | ... | https://journal.ohrm.bba.md/index.php/journal-... | NaN | NaN | R\|Q | Medicine \| Science | No | 2021-03-04T16:06:58Z | 2021-03-04T16:06:58Z | 4 | 2021-03-04T20:46:57Z |
-| 16027 | فصلنامه پژوهشهای مدیریت منابع انسانی | https://hrmj.ihu.ac.ir/?lang=en | https://doaj.org/toc/87d44ffb6ff849b18d5ddce9c... | Journal of Research in Human Resources Management | 8254-8002 | 2645-5072 | human resources management | Persian | Imam Hussein University | Iran, Islamic Republic of | ... | https://hrmj.ihu.ac.ir/?lang=en | NaN | NaN | HF5549-5549.5 | Social Sciences: Commerce: Business: Personnel... | No | 2021-01-20T11:27:05Z | 2021-01-20T11:27:05Z | 0 | NaN |
-| 16028 | Science of Tsunami Hazards | http://tsunamisociety.org/ | https://doaj.org/toc/a4f06be11f4f4db489dc034c7... | NaN | 8755-6839 | NaN | tsunamis, tsunami warning systems, earthquakes... | English | Tsunami Society International | United States | ... | http://tsunamisociety.org/AboutUs.html | NaN | NaN | GC1-1581 | Geography. Anthropology. Recreation: Oceanography | No | 2009-04-16T17:40:30Z | 2016-07-21T16:09:38Z | 239 | 2021-02-27T01:00:51Z |
-
-16029 rows × 53 columns
-
-```python
-# add the ISSN-L: use the print ISSN, or the EISSN when it is missing, as the merge key
-doaj['issn'] = doaj['Journal ISSN (print version)']
-doaj.loc[doaj['issn'].isna(), 'issn'] = doaj['Journal EISSN (online version)']
-doaj
-```
-
-| | Journal title | Journal URL | URL in DOAJ | Alternative title | Journal ISSN (print version) | Journal EISSN (online version) | Keywords | Languages in which the journal accepts manuscripts | Publisher | Country of publisher | ... | Continues | Continued By | LCC Codes | Subjects | DOAJ Seal | Added on Date | Last updated Date | Number of Article Records | Most Recent Article Added | issn |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | Anais da Academia Brasileira de Ciências | http://www.scielo.br/scielo.php?script=sci_ser... | https://doaj.org/toc/ed09859a464f4461b1af34279... | Annals of the Brazilian Academy of Sciences | 0001-3765 | 1678-2690 | biological sciences, exact and earth sciences,... | English | Academia Brasileira de Ciências | Brazil | ... | NaN | NaN | Q | Science | No | 2004-04-23T21:31:00Z | 2017-01-04T14:19:54Z | 2649 | 2020-06-10T21:49:11Z | 0001-3765 |
-| 1 | ACME | http://riviste.unimi.it/index.php/ACME | https://doaj.org/toc/b1ca04ba56194f29a362b3eef... | NaN | 0001-494X | 2282-0035 | italian literature, classic literature, lingui... | Italian | Università degli Studi di Milano | Italy | ... | NaN | NaN | A | General Works | No | 2014-12-22T19:55:58Z | 2020-02-24T09:07:42Z | 166 | 2020-06-19T09:42:34Z | 0001-494X |
-| 2 | Acta Dermato-Venereologica | http://www.medicaljournals.se/acta | https://doaj.org/toc/ffde9666ab1d46f1a8c688ce6... | NaN | 0001-5555 | 1651-2057 | sexually transmitted infections, psoriasis, ps... | English | Society for Publication of Acta Dermato-Venere... | Sweden | ... | NaN | NaN | RL1-803 | Medicine: Dermatology | No | 2011-11-10T12:31:05Z | 2017-02-22T11:14:48Z | 1096 | 2021-03-11T13:41:33Z | 0001-5555 |
-| 3 | Acta Médica Costarricense | http://actamedica.medicos.cr/index.php/Acta_Me... | https://doaj.org/toc/a5919aee5ad2413a89cf32df0... | NaN | 0001-6012 | 2215-5856 | medicine, public health, medical sciences, health | English, Spanish | Colegio de Médicos y Cirujanos de Costa Rica | Costa Rica | ... | NaN | NaN | R | Medicine | No | 2020-12-22T11:08:24Z | 2020-12-22T11:08:24Z | 1207 | 2015-12-08T15:06:43Z | 0001-6012 |
-| 4 | Acta Mycologica | https://pbsociety.org.pl/journals/index.php/am... | https://doaj.org/toc/0e8e2531ae3f455ebb49acb08... | NaN | 0001-625X | 2353-074X | mycology, micromycetes, marcomycetes, slime mo... | English | Polish Botanical Society | Poland | ... | NaN | NaN | QH301-705.5 | Science: Biology (General) | No | 2014-05-29T20:02:32Z | 2021-01-16T17:41:32Z | 1154 | 2021-03-05T18:55:46Z | 0001-625X |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | BME Frontiers | https://spj.sciencemag.org/bmef | https://doaj.org/toc/f9fa881c1be5443a86ed71c2e... | Biomedical Engineering Frontiers | NaN | 2765-8031 | biomedical imaging, biomedical devices, biomat... | English | American Association for the Advancement of Sc... | United States | ... | NaN | NaN | R855-855.5\|TP248.13-248.65 | Medicine: Medicine (General): Medical technolo... | No | 2021-01-22T11:54:20Z | 2021-01-22T11:54:20Z | 11 | 2021-03-08T09:06:36Z | 2765-8031 |
-| 16025 | Harvard Kennedy School Misinformation Review | https://misinforeview.hks.harvard.edu | https://doaj.org/toc/d71096ec7090499681cc0ccf8... | HKS Misinformation Review | NaN | 2766-1652 | misinformation, disinformation, fake news | English | Harvard Kennedy School | United States | ... | NaN | NaN | T58.5-58.64\|P87-96 | Technology: Technology (General): Industrial e... | No | 2021-02-12T10:29:21Z | 2021-02-12T10:29:21Z | 0 | NaN | 2766-1652 |
-| 16026 | One Health & Risk Management | https://journal.ohrm.bba.md/index.php/journal-... | https://doaj.org/toc/68671b966cd24a0ebaa44d78f... | OH&RM | 2887-3458 | 2587-3466 | one health, risc management, public health, hu... | English, Romanian, French, Russian | Asociatia de Biosiguranta si Biosecuritate | Moldova, Republic of | ... | NaN | NaN | R\|Q | Medicine \| Science | No | 2021-03-04T16:06:58Z | 2021-03-04T16:06:58Z | 4 | 2021-03-04T20:46:57Z | 2887-3458 |
-| 16027 | فصلنامه پژوهشهای مدیریت منابع انسانی | https://hrmj.ihu.ac.ir/?lang=en | https://doaj.org/toc/87d44ffb6ff849b18d5ddce9c... | Journal of Research in Human Resources Management | 8254-8002 | 2645-5072 | human resources management | Persian | Imam Hussein University | Iran, Islamic Republic of | ... | NaN | NaN | HF5549-5549.5 | Social Sciences: Commerce: Business: Personnel... | No | 2021-01-20T11:27:05Z | 2021-01-20T11:27:05Z | 0 | NaN | 8254-8002 |
-| 16028 | Science of Tsunami Hazards | http://tsunamisociety.org/ | https://doaj.org/toc/a4f06be11f4f4db489dc034c7... | NaN | 8755-6839 | NaN | tsunamis, tsunami warning systems, earthquakes... | English | Tsunami Society International | United States | ... | NaN | NaN | GC1-1581 | Geography. Anthropology. Recreation: Oceanography | No | 2009-04-16T17:40:30Z | 2016-07-21T16:09:38Z | 239 | 2021-02-27T01:00:51Z | 8755-6839 |
-
-16029 rows × 54 columns
-
-
-```python
-doaj = pd.merge(doaj, df_issnl, on='issn', how='left')
-doaj
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal title | Journal URL | URL in DOAJ | Alternative title | Journal ISSN (print version) | Journal EISSN (online version) | Keywords | Languages in which the journal accepts manuscripts | Publisher | Country of publisher | ... | Continued By | LCC Codes | Subjects | DOAJ Seal | Added on Date | Last updated Date | Number of Article Records | Most Recent Article Added | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | Anais da Academia Brasileira de Ciências | http://www.scielo.br/scielo.php?script=sci_ser... | https://doaj.org/toc/ed09859a464f4461b1af34279... | Annals of the Brazilian Academy of Sciences | 0001-3765 | 1678-2690 | biological sciences, exact and earth sciences,... | English | Academia Brasileira de Ciências | Brazil | ... | NaN | Q | Science | No | 2004-04-23T21:31:00Z | 2017-01-04T14:19:54Z | 2649 | 2020-06-10T21:49:11Z | 0001-3765 | 0001-3765 |
-| 1 | ACME | http://riviste.unimi.it/index.php/ACME | https://doaj.org/toc/b1ca04ba56194f29a362b3eef... | NaN | 0001-494X | 2282-0035 | italian literature, classic literature, lingui... | Italian | Università degli Studi di Milano | Italy | ... | NaN | A | General Works | No | 2014-12-22T19:55:58Z | 2020-02-24T09:07:42Z | 166 | 2020-06-19T09:42:34Z | 0001-494X | 0001-494X |
-| 2 | Acta Dermato-Venereologica | http://www.medicaljournals.se/acta | https://doaj.org/toc/ffde9666ab1d46f1a8c688ce6... | NaN | 0001-5555 | 1651-2057 | sexually transmitted infections, psoriasis, ps... | English | Society for Publication of Acta Dermato-Venere... | Sweden | ... | NaN | RL1-803 | Medicine: Dermatology | No | 2011-11-10T12:31:05Z | 2017-02-22T11:14:48Z | 1096 | 2021-03-11T13:41:33Z | 0001-5555 | 0001-5555 |
-| 3 | Acta Médica Costarricense | http://actamedica.medicos.cr/index.php/Acta_Me... | https://doaj.org/toc/a5919aee5ad2413a89cf32df0... | NaN | 0001-6012 | 2215-5856 | medicine, public health, medical sciences, health | English, Spanish | Colegio de Médicos y Cirujanos de Costa Rica | Costa Rica | ... | NaN | R | Medicine | No | 2020-12-22T11:08:24Z | 2020-12-22T11:08:24Z | 1207 | 2015-12-08T15:06:43Z | 0001-6012 | 0001-6012 |
-| 4 | Acta Mycologica | https://pbsociety.org.pl/journals/index.php/am... | https://doaj.org/toc/0e8e2531ae3f455ebb49acb08... | NaN | 0001-625X | 2353-074X | mycology, micromycetes, marcomycetes, slime mo... | English | Polish Botanical Society | Poland | ... | NaN | QH301-705.5 | Science: Biology (General) | No | 2014-05-29T20:02:32Z | 2021-01-16T17:41:32Z | 1154 | 2021-03-05T18:55:46Z | 0001-625X | 0001-625X |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | BME Frontiers | https://spj.sciencemag.org/bmef | https://doaj.org/toc/f9fa881c1be5443a86ed71c2e... | Biomedical Engineering Frontiers | NaN | 2765-8031 | biomedical imaging, biomedical devices, biomat... | English | American Association for the Advancement of Sc... | United States | ... | NaN | R855-855.5\|TP248.13-248.65 | Medicine: Medicine (General): Medical technolo... | No | 2021-01-22T11:54:20Z | 2021-01-22T11:54:20Z | 11 | 2021-03-08T09:06:36Z | 2765-8031 | NaN |
-| 16025 | Harvard Kennedy School Misinformation Review | https://misinforeview.hks.harvard.edu | https://doaj.org/toc/d71096ec7090499681cc0ccf8... | HKS Misinformation Review | NaN | 2766-1652 | misinformation, disinformation, fake news | English | Harvard Kennedy School | United States | ... | NaN | T58.5-58.64\|P87-96 | Technology: Technology (General): Industrial e... | No | 2021-02-12T10:29:21Z | 2021-02-12T10:29:21Z | 0 | NaN | 2766-1652 | NaN |
-| 16026 | One Health & Risk Management | https://journal.ohrm.bba.md/index.php/journal-... | https://doaj.org/toc/68671b966cd24a0ebaa44d78f... | OH&RM | 2887-3458 | 2587-3466 | one health, risc management, public health, hu... | English, Romanian, French, Russian | Asociatia de Biosiguranta si Biosecuritate | Moldova, Republic of | ... | NaN | R\|Q | Medicine \| Science | No | 2021-03-04T16:06:58Z | 2021-03-04T16:06:58Z | 4 | 2021-03-04T20:46:57Z | 2887-3458 | NaN |
-| 16027 | فصلنامه پژوهشهای مدیریت منابع انسانی | https://hrmj.ihu.ac.ir/?lang=en | https://doaj.org/toc/87d44ffb6ff849b18d5ddce9c... | Journal of Research in Human Resources Management | 8254-8002 | 2645-5072 | human resources management | Persian | Imam Hussein University | Iran, Islamic Republic of | ... | NaN | HF5549-5549.5 | Social Sciences: Commerce: Business: Personnel... | No | 2021-01-20T11:27:05Z | 2021-01-20T11:27:05Z | 0 | NaN | 8254-8002 | NaN |
-| 16028 | Science of Tsunami Hazards | http://tsunamisociety.org/ | https://doaj.org/toc/a4f06be11f4f4db489dc034c7... | NaN | 8755-6839 | NaN | tsunamis, tsunami warning systems, earthquakes... | English | Tsunami Society International | United States | ... | NaN | GC1-1581 | Geography. Anthropology. Recreation: Oceanography | No | 2009-04-16T17:40:30Z | 2016-07-21T16:09:38Z | 239 | 2021-02-27T01:00:51Z | 8755-6839 | 8755-6839 |
-
-
-
-
16029 rows × 55 columns
-
-
-
-
-
-```python
-doaj.columns
-```
-
-
-
-
- Index(['Journal title', 'Journal URL', 'URL in DOAJ', 'Alternative title',
- 'Journal ISSN (print version)', 'Journal EISSN (online version)',
- 'Keywords', 'Languages in which the journal accepts manuscripts',
- 'Publisher', 'Country of publisher', 'Society or institution',
- 'Country of society or institution', 'Journal license',
- 'License attributes', 'URL for license terms',
- 'Machine-readable CC licensing information embedded or displayed in articles',
- 'URL to an example page with embedded licensing information',
- 'Author holds copyright without restrictions',
- 'Copyright information URL', 'Review process',
- 'Review process information URL', 'Journal plagiarism screening policy',
- 'Plagiarism information URL', 'URL for journal's aims & scope',
- 'URL for the Editorial Board page',
- 'URL for journal's instructions for authors',
- 'Average number of weeks between article submission and publication',
- 'APC', 'APC information URL', 'APC amount',
- 'Journal waiver policy (for developing country authors etc)',
- 'Waiver policy information URL', 'Has other fees',
- 'Other submission fees information URL', 'Preservation Services',
- 'Preservation Service: national library',
- 'Preservation information URL', 'Deposit policy directory',
- 'URL for deposit policy', 'Persistent article identifiers',
- 'Article metadata includes ORCIDs',
- 'Journal complies with I4OC standards for open citations',
- 'Does this journal allow unrestricted reuse in compliance with BOAI?',
- 'URL for journal's Open Access statement', 'Continues', 'Continued By',
- 'LCC Codes', 'Subjects', 'DOAJ Seal', 'Added on Date',
- 'Last updated Date', 'Number of Article Records',
- 'Most Recent Article Added', 'issn', 'issnl'],
- dtype='object')
-
-
-
-
-```python
-doaj['Preservation Services']
-```
-
-
-
-
- 0 NaN
- 1 NaN
- 2 NaN
- 3 PKP PN
- 4 NaN
- ...
- 16024 NaN
- 16025 NaN
- 16026 NaN
- 16027 NaN
- 16028 NaN
- Name: Preservation Services, Length: 16029, dtype: object
-
-
-
-
-```python
-doaj['DOAJ Seal']
-```
-
-
-
-
- 0 No
- 1 No
- 2 No
- 3 No
- 4 No
- ..
- 16024 No
- 16025 No
- 16026 No
- 16027 No
- 16028 No
- Name: DOAJ Seal, Length: 16029, dtype: object
-
-
-
-
-```python
-doaj['issnl']
-```
-
-
-
-
- 0 0001-3765
- 1 0001-494X
- 2 0001-5555
- 3 0001-6012
- 4 0001-625X
- ...
- 16024 NaN
- 16025 NaN
- 16026 NaN
- 16027 NaN
- 16028 8755-6839
- Name: issnl, Length: 16029, dtype: object
-
-
-
-
-```python
-doaj['APC'].value_counts()
-```
-
-
-
-
- No 11567
- Yes 4462
- Name: APC, dtype: int64
-
-
-
-
-```python
-# add the DOAJ information, keyed on the ISSN-L:
-# Journal title
-# DOAJ Seal
-# APC
-doaj_for_merge = doaj[['issnl', 'Journal title', 'DOAJ Seal', 'APC']]
-doaj_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | Journal title | DOAJ Seal | APC |
-|---|---|---|---|---|
-| 0 | 0001-3765 | Anais da Academia Brasileira de Ciências | No | No |
-| 1 | 0001-494X | ACME | No | No |
-| 2 | 0001-5555 | Acta Dermato-Venereologica | No | Yes |
-| 3 | 0001-6012 | Acta Médica Costarricense | No | No |
-| 4 | 0001-625X | Acta Mycologica | No | Yes |
-| ... | ... | ... | ... | ... |
-| 16024 | NaN | BME Frontiers | No | No |
-| 16025 | NaN | Harvard Kennedy School Misinformation Review | No | No |
-| 16026 | NaN | One Health & Risk Management | No | No |
-| 16027 | NaN | فصلنامه پژوهشهای مدیریت منابع انسانی | No | No |
-| 16028 | 8755-6839 | Science of Tsunami Hazards | No | No |
-
-
-
-
16029 rows × 4 columns
-
-
-
-
-
-```python
-# rename the columns
-doaj_for_merge = doaj_for_merge.rename(columns={'Journal title' : 'doaj_title', 'DOAJ Seal' : 'doaj_seal'})
-doaj_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | doaj_title | doaj_seal | APC |
-|---|---|---|---|---|
-| 0 | 0001-3765 | Anais da Academia Brasileira de Ciências | No | No |
-| 1 | 0001-494X | ACME | No | No |
-| 2 | 0001-5555 | Acta Dermato-Venereologica | No | Yes |
-| 3 | 0001-6012 | Acta Médica Costarricense | No | No |
-| 4 | 0001-625X | Acta Mycologica | No | Yes |
-| ... | ... | ... | ... | ... |
-| 16024 | NaN | BME Frontiers | No | No |
-| 16025 | NaN | Harvard Kennedy School Misinformation Review | No | No |
-| 16026 | NaN | One Health & Risk Management | No | No |
-| 16027 | NaN | فصلنامه پژوهشهای مدیریت منابع انسانی | No | No |
-| 16028 | 8755-6839 | Science of Tsunami Hazards | No | No |
-
-
-
-
16029 rows × 4 columns
-
-
-
-
-
-```python
-# merge with journals
-journals = pd.merge(journals, doaj_for_merge, on='issnl', how='left')
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | Yes | Yes |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 991 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN |
-| 992 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN |
-| 993 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN |
-| 994 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN |
-| 995 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN |
-
-
-
-
996 rows × 13 columns
-
-
-
-
-
-```python
-# add flags for presence in DOAJ and for the DOAJ Seal
-journals.loc[journals['doaj_title'].isna(), 'doaj_status'] = 0
-journals.loc[~journals['doaj_title'].isna(), 'doaj_status'] = 1
-journals.loc[journals['doaj_seal'] == 'Yes', 'doaj_seal'] = 1
-journals.loc[journals['doaj_seal'] == 'No', 'doaj_seal'] = 0
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 991 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 |
-| 992 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 |
-| 993 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 |
-| 994 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 |
-| 995 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 |
-
-
-
-
996 rows × 14 columns
-
-
-
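The presence and seal flags computed in the cell above can also be derived in one step each, without repeated `.loc` assignments. The following is a minimal sketch on a toy frame; `journals_demo` and its rows are illustrative stand-ins, not the real `journals` table.

```python
import pandas as pd

# Toy stand-in for the merged journals table (illustrative values only)
journals_demo = pd.DataFrame({
    'issnl': ['1660-9379', '1932-6203', '0001-5555'],
    'doaj_title': [None, 'PLoS ONE', 'Acta Dermato-Venereologica'],
    'doaj_seal': [None, 'Yes', 'No'],
})

# 1 when the journal matched a DOAJ record, 0 otherwise
journals_demo['doaj_status'] = journals_demo['doaj_title'].notna().astype(int)

# Map the Yes/No strings to 1/0; journals without a DOAJ match stay missing (NaN)
journals_demo['doaj_seal'] = journals_demo['doaj_seal'].map({'Yes': 1, 'No': 0})
```

Unlike the original cell, this yields an integer `doaj_status` column rather than floats, which may or may not matter for the downstream import.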
-
-### LOCKSS
-
-
-```python
-# add preservation information from LOCKSS, Portico and the National Licences
-lockss = pd.read_csv('lockss/keepers-LOCKSS-report.csv', encoding='utf-8', header=0, skiprows=1)
-lockss
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | ISSN | eISSN | Preserved Volumes | Preserved Years | In Progress Volumes | In Progress Years |
-|---|---|---|---|---|---|---|---|---|
-| 0 | ARKAT USA | ARKIVOC | 1551-7004 | 1551-7012 | 2000; 2001; 2002; 2003; 2004; 2004; 2004; 2005 | 2000; 2001; 2002; 2003; 2004; 2004; 2004; 2005 | NaN | NaN |
-| 1 | Ab Imperio | Ab Imperio | 2166-4072 | 2164-9731 | 2005; 2006; 2007; 2008; 2009; 2010; 2011; 2012... | 2000; 2001; 2002; 2003; 2004; 2005; 2005; 2006... | NaN | 2020 |
-| 2 | Absinthe Literary Review | Absinthe Literary Review | NaN | 1939-0343 | NaN | 2003; 2004; 2005 | NaN | NaN |
-| 3 | Academy Health | eGEMs | NaN | 2327-9214 | 1; 2; 2; 3; 4 | 2013; 2014; 2014; 2015; 2016 | NaN | NaN |
-| 4 | Academy of American Franciscan History | The Americas | 0003-1615 | 1533-6247 | 57; 58; 59; 60; 61; 62; 63; 64; 65; 66; 67; 68... | 2000; 2001; 2002; 2003; 2004; 2005; 2006; 2007... | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 14988 | Youngstown State University Center for Judaic ... | Journal of Jewish Identities | 1946-2522 | 1939-7941 | 1; 2; 3; 4; 5; 6; 7; 8 | 2008; 2009; 2010; 2011; 2012; 2013; 2014; 2015 | NaN | NaN |
-| 14989 | Zoological Society of Japan | Zoological Science | 0289-0003 | NaN | 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23... | 1995; 1996; 1997; 1998; 1999; 2000; 2001; 2002... | NaN | NaN |
-| 14990 | Zoological Society of Southern Africa | African Zoology | 1562-7020 | 2224-073X | 41; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52 | 2006; 2007; 2008; 2009; 2010; 2011; 2012; 2013... | NaN | NaN |
-| 14991 | eLife Sciences Publications | eLife | NaN | 2050-084X | NaN | 2014; 2014; 2014; 2014; 2014; 2014; 2014; 2014... | NaN | NaN |
-| 14992 | frommann-holzboog | Steiner Studies | NaN | 2698-217X | NaN | NaN | 1 | 2020 |
-
-
-
-
14993 rows × 8 columns
-
-
-
-
-
-```python
-# add the ISSN-L: use the eISSN as lookup key, falling back to the print ISSN
-lockss['issn'] = lockss['eISSN']
-lockss.loc[lockss['eISSN'].isna(), 'issn'] = lockss['ISSN']
-lockss
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | ISSN | eISSN | Preserved Volumes | Preserved Years | In Progress Volumes | In Progress Years | issn |
-|---|---|---|---|---|---|---|---|---|---|
-| 0 | ARKAT USA | ARKIVOC | 1551-7004 | 1551-7012 | 2000; 2001; 2002; 2003; 2004; 2004; 2004; 2005 | 2000; 2001; 2002; 2003; 2004; 2004; 2004; 2005 | NaN | NaN | 1551-7012 |
-| 1 | Ab Imperio | Ab Imperio | 2166-4072 | 2164-9731 | 2005; 2006; 2007; 2008; 2009; 2010; 2011; 2012... | 2000; 2001; 2002; 2003; 2004; 2005; 2005; 2006... | NaN | 2020 | 2164-9731 |
-| 2 | Absinthe Literary Review | Absinthe Literary Review | NaN | 1939-0343 | NaN | 2003; 2004; 2005 | NaN | NaN | 1939-0343 |
-| 3 | Academy Health | eGEMs | NaN | 2327-9214 | 1; 2; 2; 3; 4 | 2013; 2014; 2014; 2015; 2016 | NaN | NaN | 2327-9214 |
-| 4 | Academy of American Franciscan History | The Americas | 0003-1615 | 1533-6247 | 57; 58; 59; 60; 61; 62; 63; 64; 65; 66; 67; 68... | 2000; 2001; 2002; 2003; 2004; 2005; 2006; 2007... | NaN | NaN | 1533-6247 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 14988 | Youngstown State University Center for Judaic ... | Journal of Jewish Identities | 1946-2522 | 1939-7941 | 1; 2; 3; 4; 5; 6; 7; 8 | 2008; 2009; 2010; 2011; 2012; 2013; 2014; 2015 | NaN | NaN | 1939-7941 |
-| 14989 | Zoological Society of Japan | Zoological Science | 0289-0003 | NaN | 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23... | 1995; 1996; 1997; 1998; 1999; 2000; 2001; 2002... | NaN | NaN | 0289-0003 |
-| 14990 | Zoological Society of Southern Africa | African Zoology | 1562-7020 | 2224-073X | 41; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52 | 2006; 2007; 2008; 2009; 2010; 2011; 2012; 2013... | NaN | NaN | 2224-073X |
-| 14991 | eLife Sciences Publications | eLife | NaN | 2050-084X | NaN | 2014; 2014; 2014; 2014; 2014; 2014; 2014; 2014... | NaN | NaN | 2050-084X |
-| 14992 | frommann-holzboog | Steiner Studies | NaN | 2698-217X | NaN | NaN | 1 | 2020 | 2698-217X |
-
-
-
-
14993 rows × 9 columns
-
-
-
-
-
-```python
-lockss = pd.merge(lockss, df_issnl, on='issn', how='left')
-lockss
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | ISSN | eISSN | Preserved Volumes | Preserved Years | In Progress Volumes | In Progress Years | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | ARKAT USA | ARKIVOC | 1551-7004 | 1551-7012 | 2000; 2001; 2002; 2003; 2004; 2004; 2004; 2005 | 2000; 2001; 2002; 2003; 2004; 2004; 2004; 2005 | NaN | NaN | 1551-7012 | 1551-7004 |
-| 1 | Ab Imperio | Ab Imperio | 2166-4072 | 2164-9731 | 2005; 2006; 2007; 2008; 2009; 2010; 2011; 2012... | 2000; 2001; 2002; 2003; 2004; 2005; 2005; 2006... | NaN | 2020 | 2164-9731 | 2166-4072 |
-| 2 | Absinthe Literary Review | Absinthe Literary Review | NaN | 1939-0343 | NaN | 2003; 2004; 2005 | NaN | NaN | 1939-0343 | 1939-0343 |
-| 3 | Academy Health | eGEMs | NaN | 2327-9214 | 1; 2; 2; 3; 4 | 2013; 2014; 2014; 2015; 2016 | NaN | NaN | 2327-9214 | 2327-9214 |
-| 4 | Academy of American Franciscan History | The Americas | 0003-1615 | 1533-6247 | 57; 58; 59; 60; 61; 62; 63; 64; 65; 66; 67; 68... | 2000; 2001; 2002; 2003; 2004; 2005; 2006; 2007... | NaN | NaN | 1533-6247 | 0003-1615 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 14988 | Youngstown State University Center for Judaic ... | Journal of Jewish Identities | 1946-2522 | 1939-7941 | 1; 2; 3; 4; 5; 6; 7; 8 | 2008; 2009; 2010; 2011; 2012; 2013; 2014; 2015 | NaN | NaN | 1939-7941 | 1939-7941 |
-| 14989 | Zoological Society of Japan | Zoological Science | 0289-0003 | NaN | 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23... | 1995; 1996; 1997; 1998; 1999; 2000; 2001; 2002... | NaN | NaN | 0289-0003 | 0289-0003 |
-| 14990 | Zoological Society of Southern Africa | African Zoology | 1562-7020 | 2224-073X | 41; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52 | 2006; 2007; 2008; 2009; 2010; 2011; 2012; 2013... | NaN | NaN | 2224-073X | 1562-7020 |
-| 14991 | eLife Sciences Publications | eLife | NaN | 2050-084X | NaN | 2014; 2014; 2014; 2014; 2014; 2014; 2014; 2014... | NaN | NaN | 2050-084X | 2050-084X |
-| 14992 | frommann-holzboog | Steiner Studies | NaN | 2698-217X | NaN | NaN | 1 | 2020 | 2698-217X | NaN |
- NaN
-
-
-
-
14993 rows × 10 columns
-
-
-
-
-
-```python
-lockss.columns
-```
-
-
-
-
- Index(['Publisher', 'Title', 'ISSN', 'eISSN', 'Preserved Volumes',
- 'Preserved Years', 'In Progress Volumes', 'In Progress Years', 'issn',
- 'issnl'],
- dtype='object')
-
-
-
-
-```python
-# check the rows that found no match in the merge
-lockss.loc[lockss['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | ISSN | eISSN | Preserved Volumes | Preserved Years | In Progress Volumes | In Progress Years | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 5 | Academy of Management | Academy of Management Discoveries (AMD) | NaN | 2168-1007 | 1; 2; 3 | 2015; 2016; 2017 | NaN | NaN | 2168-1007 | NaN |
-| 28 | Alliance of Crop, Soil, and Environmental Scie... | Soil Horizons | NaN | 2163-2812 | 50; 51; 52; 53; 54; 55; 56 | 2009; 2010; 2011; 2012; 2013; 2014; 2015 | NaN | NaN | 2163-2812 | NaN |
-| 131 | American Institute of Aeronautics and Astronau... | Air Traffic Control Quarterly | 1064-3818 | 2472-5757 | 1; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15... | 1993; 1995; 1996; 1997; 1998; 1999; 2000; 2001... | 2 | 1994 | 2472-5757 | NaN |
-| 134 | American Institute of Aeronautics and Astronau... | Journal of Air Transportation | NaN | 2380-9450 | 24; 25; 26; 27 | 2016; 2017; 2018; 2019 | 28 | 2020 | 2380-9450 | NaN |
-| 192 | American Psychiatric Association Publishing | Psychiatric Research and Clinical Practice | NaN | 2575-5609 | 1 | 2019 | 2 | 2020 | 2575-5609 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 14900 | Utrecht University Library | Early Modern Low Countries | NaN | 2543-1587 | NaN | NaN | 1; 2; 3; 4 | 2017; 2018; 2019; 2020 | 2543-1587 | NaN |
-| 14968 | White Rose University Press | British and Irish Orthoptic Journal | NaN | 2516-3590 | 6; 7; 8; 9; 10; 11; 12; 13; 14; 16 | 2009; 2010; 2011; 2012; 2013; 2014; 2015; 2016... | 17 | 2020 | 2516-3590 | NaN |
-| 14970 | White Rose University Press | Undergraduate Journal of Politics and Internat... | NaN | 2398-5992 | 1; 2 | 2018; 2019 | NaN | NaN | 2398-5992 | NaN |
-| 14985 | World Haiku Club | World Haiku Review | NaN | NaN | 1; 2; 3 | 2001; 2002; 2003 | NaN | NaN | NaN | NaN |
-| 14992 | frommann-holzboog | Steiner Studies | NaN | 2698-217X | NaN | NaN | 1 | 2020 | 2698-217X | NaN |
- NaN
-
-
-
-
835 rows × 10 columns
-
-
-
-
-
-```python
-# use the ISSN instead on these rows
-lockss.loc[lockss['issnl'].isna(), 'issnl'] = lockss['issn']
-```
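The fallback chain used for LOCKSS (eISSN first, then print ISSN as the lookup key, then the raw ISSN wherever no ISSN-L was found) can be expressed with `fillna`. This is a minimal sketch on toy data: `lockss_demo` and `issnl_lookup` are hypothetical stand-ins for `lockss` and `df_issnl`, and the lookup table holds only two illustrative rows.

```python
import pandas as pd

# Toy stand-in for the LOCKSS report
lockss_demo = pd.DataFrame({
    'Title': ['ARKIVOC', 'Zoological Science', 'Steiner Studies', 'World Haiku Review'],
    'ISSN': ['1551-7004', '0289-0003', None, None],
    'eISSN': ['1551-7012', None, '2698-217X', None],
})

# Hypothetical ISSN -> ISSN-L lookup table (stand-in for df_issnl)
issnl_lookup = pd.DataFrame({
    'issn': ['1551-7012', '0289-0003'],
    'issnl': ['1551-7004', '0289-0003'],
})

# Lookup key: eISSN when present, otherwise the print ISSN
lockss_demo['issn'] = lockss_demo['eISSN'].fillna(lockss_demo['ISSN'])

# Attach the ISSN-L; rows without a match get NaN
lockss_demo = lockss_demo.merge(issnl_lookup, on='issn', how='left')

# Fall back to the lookup key itself where no ISSN-L was found
lockss_demo['issnl'] = lockss_demo['issnl'].fillna(lockss_demo['issn'])
```

Rows that carry neither an ISSN nor an eISSN still end up without an `issnl`, so they cannot be matched to `journals` in any later merge.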
-
-
-```python
-# check the rows that still have no ISSN-L
-lockss.loc[lockss['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | ISSN | eISSN | Preserved Volumes | Preserved Years | In Progress Volumes | In Progress Years | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 317 | Association des Amis des Cryptogames | Cryptogamie, Algologie | NaN | NaN | 32; 33; 34; 35; 36; 37; 38 | 2011; 2012; 2013; 2014; 2015; 2016; 2017 | NaN | NaN | NaN | NaN |
-| 318 | Association des Amis des Cryptogames | Cryptogamie, Bryologie | NaN | NaN | 32; 33; 34; 35; 36; 37; 38 | 2011; 2012; 2013; 2014; 2015; 2016; 2017 | NaN | NaN | NaN | NaN |
-| 319 | Association des Amis des Cryptogames | Cryptogamie, Mycologie | NaN | NaN | 32; 33; 34; 35; 36; 37; 38 | 2011; 2012; 2013; 2014; 2015; 2016; 2017 | NaN | NaN | NaN | NaN |
-| 850 | Boston College Libraries | Fresh Ink: Essays From Boston College's First-... | NaN | NaN | 12; 13; 13; 9 | 2009; 2010; 2011; 2007 | NaN | NaN | NaN | NaN |
-| 1681 | Exquisite Corpse | Exquisite Corpse | NaN | NaN | NaN | 1999 | NaN | NaN | NaN | NaN |
-| 2032 | Georgia Southern University | Irish Studies South | NaN | NaN | 1 | 2014 | NaN | NaN | NaN | NaN |
-| 2039 | Georgia Southern University | The Journal of Student Success in Writing | NaN | NaN | 1 | 2017 | NaN | NaN | NaN | NaN |
-| 3526 | LOCKSS Program | LOCKSS Card | NaN | NaN | NaN | 2005; 2006; 2006; 2006 | NaN | NaN | NaN | NaN |
-| 4721 | Oxford University Press | International Immunology Meeting Abstracts | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 6725 | Sagamore Publishing | Journal of Facility Planning, Design, and Mana... | NaN | NaN | 1; 2; 3; 4 | 2013; 2014; 2015; 2016 | NaN | NaN | NaN | NaN |
-| 10718 | State of Alaska | Alaska State Documents | NaN | NaN | NaN | 2005; 2005; 2006; 2006; 2007; 2007; 2008; 2008... | NaN | NaN | NaN | NaN |
-| 14985 | World Haiku Club | World Haiku Review | NaN | NaN | 1; 2; 3 | 2001; 2002; 2003 | NaN | NaN | NaN | NaN |
-
-
-
-
-
-
-
-
-```python
-# add the LOCKSS information:
-# Title
-lockss_for_merge = lockss[['issnl', 'Title']]
-lockss_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | Title |
-|---|---|---|
-| 0 | 1551-7004 | ARKIVOC |
-| 1 | 2166-4072 | Ab Imperio |
-| 2 | 1939-0343 | Absinthe Literary Review |
-| 3 | 2327-9214 | eGEMs |
-| 4 | 0003-1615 | The Americas |
-| ... | ... | ... |
-| 14988 | 1939-7941 | Journal of Jewish Identities |
-| 14989 | 0289-0003 | Zoological Science |
-| 14990 | 1562-7020 | African Zoology |
-| 14991 | 2050-084X | eLife |
-| 14992 | 2698-217X | Steiner Studies |
-
-
-
-
14993 rows × 2 columns
-
-
-
-
-
-```python
-# rename the columns
-lockss_for_merge = lockss_for_merge.rename(columns={'Title' : 'lockss_title'})
-lockss_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | lockss_title |
-|---|---|---|
-| 0 | 1551-7004 | ARKIVOC |
-| 1 | 2166-4072 | Ab Imperio |
-| 2 | 1939-0343 | Absinthe Literary Review |
-| 3 | 2327-9214 | eGEMs |
-| 4 | 0003-1615 | The Americas |
-| ... | ... | ... |
-| 14988 | 1939-7941 | Journal of Jewish Identities |
-| 14989 | 0289-0003 | Zoological Science |
-| 14990 | 1562-7020 | African Zoology |
-| 14991 | 2050-084X | eLife |
-| 14992 | 2698-217X | Steiner Studies |
-
-
-
-
14993 rows × 2 columns
-
-
-
-
-
-```python
-# merge with journals
-journals = pd.merge(journals, lockss_for_merge, on='issnl', how='left')
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 | PLoS One |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1000 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN |
-| 1001 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-| 1002 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields |
-| 1003 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN |
-| 1004 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-
-
-
-
1005 rows × 15 columns
-
-
-
-
-
-```python
-# remove the duplicates
-journals = journals.drop_duplicates(subset=['id'])
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 | PLoS One |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1000 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN |
-| 1001 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-| 1002 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields |
-| 1003 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN |
-| 1004 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN |
-
-
-
-
996 rows × 15 columns
-
-
-
-
-
-```python
-# add a flag for presence in LOCKSS
-journals = journals.copy()  # work on a copy to avoid SettingWithCopyWarning
-journals['lockss'] = journals['lockss_title'].notna().astype(float)
-journals
-```
-
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 | PLoS One | 1.0 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1000 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| 1001 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| 1002 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 |
-| 1003 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-| 1004 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 |
-
-
-
-
996 rows × 16 columns
-
-
-
-
-### Portico
-
-
-```python
-# add Portico preservation information
-portico = pd.read_excel('portico/e-journals.xlsx', sheet_name='Details', skiprows=2)
-portico
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | Society | Print ISSN | e-ISSN | PCA | Status | Years | ContentSet Id | Holdings | ... | Unnamed: 13 | Unnamed: 14 | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19 | Unnamed: 20 | Unnamed: 21 | Unnamed: 22 |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | ACI Information Group (through 2018) | ACI Information Group | NaN | NaN | 2374-1406 | No | preserved | 2017-2018 | ACI Scholarly Blog Content | 2017 - v. 2017 (January-December), 2018 - v. 2... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 1 | AECL Nuclear Review | CNL Nuclear Review | NaN | 2369-6931 | 2369-6923 | Yes | preserved | 2016-2020 | ISSN_23696931 | 2016 - v. 5 (1-2), 2016/2017 - v. 6 (1-2), 201... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 2 | AECL Nuclear Review | AECL Nuclear Review | NaN | 1929-8056 | 1929-6371 | Yes | preserved | 2014-2015 | ISSN_19298056 | 2014 - v. 1 (1-2), 2014 - v. 2 (1-2), 2014 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 3 | AIP Publishing | Low Temperature Physics | NaN | 1063-777X | 1090-6517 | Yes | preserved | 1997-2021 | ISSN_1063777X | 1997 - v. 23 (1-5, 7-12), 1998 - v. 24 (1-12),... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 4 | AIP Publishing | Physics of Fluids A: Fluid Dynamics | NaN | 0899-8213 | NaN | Yes | preserved | 1989-1993 | ISSN_08998213 | 1989 - v. 1 (1-12), 1990 - v. 2 (1-12), 1991 -... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 35550 | Zeal Press Ltd. | International Journal of Robotics and Automati... | NaN | NaN | 2409-9694 | NaN | queued | - | ISSN_24099694_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35551 | Zeal Press Ltd. | Journal of Material Science and Technology Res... | NaN | NaN | 2410-4701 | NaN | queued | - | ISSN_24104701_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35552 | Zeal Press Ltd. | Journal of Modern Mechanical Engineering and T... | NaN | NaN | 2409-9848 | NaN | queued | - | ISSN_24099848_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35553 | Zeal Press Ltd. | Journal of Solar Energy Research Updates | NaN | NaN | 2410-2199 | NaN | queued | - | ISSN_24102199_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35554 | icddr,b (through 2015) | Journal of Health, Population and Nutrition (J... | NaN | 1606-0997 | NaN | Yes | preserved | 2005-2015 | ISSN_16060997 | 2005 - v. 23 (3-4), 2006 - v. 24 (1-4), 2007 -... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-
-
-
-
35555 rows × 23 columns
-
-
-
-
-
-```python
-# first step towards adding the ISSN-L: build one ISSN key (e-ISSN, else print ISSN)
-portico['issn'] = portico['e-ISSN']
-portico.loc[portico['e-ISSN'].isna(), 'issn'] = portico['Print ISSN']
-portico
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | Society | Print ISSN | e-ISSN | PCA | Status | Years | ContentSet Id | Holdings | ... | Unnamed: 14 | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19 | Unnamed: 20 | Unnamed: 21 | Unnamed: 22 | issn |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | ACI Information Group (through 2018) | ACI Information Group | NaN | NaN | 2374-1406 | No | preserved | 2017-2018 | ACI Scholarly Blog Content | 2017 - v. 2017 (January-December), 2018 - v. 2... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2374-1406 |
-| 1 | AECL Nuclear Review | CNL Nuclear Review | NaN | 2369-6931 | 2369-6923 | Yes | preserved | 2016-2020 | ISSN_23696931 | 2016 - v. 5 (1-2), 2016/2017 - v. 6 (1-2), 201... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2369-6923 |
-| 2 | AECL Nuclear Review | AECL Nuclear Review | NaN | 1929-8056 | 1929-6371 | Yes | preserved | 2014-2015 | ISSN_19298056 | 2014 - v. 1 (1-2), 2014 - v. 2 (1-2), 2014 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1929-6371 |
-| 3 | AIP Publishing | Low Temperature Physics | NaN | 1063-777X | 1090-6517 | Yes | preserved | 1997-2021 | ISSN_1063777X | 1997 - v. 23 (1-5, 7-12), 1998 - v. 24 (1-12),... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1090-6517 |
-| 4 | AIP Publishing | Physics of Fluids A: Fluid Dynamics | NaN | 0899-8213 | NaN | Yes | preserved | 1989-1993 | ISSN_08998213 | 1989 - v. 1 (1-12), 1990 - v. 2 (1-12), 1991 -... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0899-8213 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 35550 | Zeal Press Ltd. | International Journal of Robotics and Automati... | NaN | NaN | 2409-9694 | NaN | queued | - | ISSN_24099694_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2409-9694 |
-| 35551 | Zeal Press Ltd. | Journal of Material Science and Technology Res... | NaN | NaN | 2410-4701 | NaN | queued | - | ISSN_24104701_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2410-4701 |
-| 35552 | Zeal Press Ltd. | Journal of Modern Mechanical Engineering and T... | NaN | NaN | 2409-9848 | NaN | queued | - | ISSN_24099848_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2409-9848 |
-| 35553 | Zeal Press Ltd. | Journal of Solar Energy Research Updates | NaN | NaN | 2410-2199 | NaN | queued | - | ISSN_24102199_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2410-2199 |
-| 35554 | icddr,b (through 2015) | Journal of Health, Population and Nutrition (J... | NaN | 1606-0997 | NaN | Yes | preserved | 2005-2015 | ISSN_16060997 | 2005 - v. 23 (3-4), 2006 - v. 24 (1-4), 2007 -... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1606-0997 |
-
-
-
-
35555 rows × 24 columns
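
The `issn` column shown above holds the e-ISSN where one exists and the print ISSN otherwise. As a minimal sketch on made-up rows (the frame `toy` is hypothetical and only mimics two columns of the Portico export), the same coalescing can also be written with `Series.fillna`:

```python
import pandas as pd

# Hypothetical stand-in for two columns of the Portico sheet.
toy = pd.DataFrame({
    'Print ISSN': ['1063-777X', '0899-8213', None],
    'e-ISSN': ['1090-6517', None, None],
})

# Prefer the e-ISSN; fall back to the print ISSN when the e-ISSN is missing.
toy['issn'] = toy['e-ISSN'].fillna(toy['Print ISSN'])

# Rows with neither identifier stay null and need separate handling.
print(toy)
```

This is only an alternative spelling of the two-step assignment; it should give the same `issn` values.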
-
-
-
-
-
-```python
-portico = pd.merge(portico, df_issnl, on='issn', how='left')
-portico
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | Society | Print ISSN | e-ISSN | PCA | Status | Years | ContentSet Id | Holdings | ... | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19 | Unnamed: 20 | Unnamed: 21 | Unnamed: 22 | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | ACI Information Group (through 2018) | ACI Information Group | NaN | NaN | 2374-1406 | No | preserved | 2017-2018 | ACI Scholarly Blog Content | 2017 - v. 2017 (January-December), 2018 - v. 2... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2374-1406 | 2374-1406 |
-| 1 | AECL Nuclear Review | CNL Nuclear Review | NaN | 2369-6931 | 2369-6923 | Yes | preserved | 2016-2020 | ISSN_23696931 | 2016 - v. 5 (1-2), 2016/2017 - v. 6 (1-2), 201... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2369-6923 | NaN |
-| 2 | AECL Nuclear Review | AECL Nuclear Review | NaN | 1929-8056 | 1929-6371 | Yes | preserved | 2014-2015 | ISSN_19298056 | 2014 - v. 1 (1-2), 2014 - v. 2 (1-2), 2014 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1929-6371 | 1929-8056 |
-| 3 | AIP Publishing | Low Temperature Physics | NaN | 1063-777X | 1090-6517 | Yes | preserved | 1997-2021 | ISSN_1063777X | 1997 - v. 23 (1-5, 7-12), 1998 - v. 24 (1-12),... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1090-6517 | 1063-777X |
-| 4 | AIP Publishing | Physics of Fluids A: Fluid Dynamics | NaN | 0899-8213 | NaN | Yes | preserved | 1989-1993 | ISSN_08998213 | 1989 - v. 1 (1-12), 1990 - v. 2 (1-12), 1991 -... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0899-8213 | 0899-8213 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 35550 | Zeal Press Ltd. | International Journal of Robotics and Automati... | NaN | NaN | 2409-9694 | NaN | queued | - | ISSN_24099694_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2409-9694 | 2409-9694 |
-| 35551 | Zeal Press Ltd. | Journal of Material Science and Technology Res... | NaN | NaN | 2410-4701 | NaN | queued | - | ISSN_24104701_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2410-4701 | 2410-4701 |
-| 35552 | Zeal Press Ltd. | Journal of Modern Mechanical Engineering and T... | NaN | NaN | 2409-9848 | NaN | queued | - | ISSN_24099848_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2409-9848 | 2409-9848 |
-| 35553 | Zeal Press Ltd. | Journal of Solar Energy Research Updates | NaN | NaN | 2410-2199 | NaN | queued | - | ISSN_24102199_1023 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2410-2199 | 2410-2199 |
-| 35554 | icddr,b (through 2015) | Journal of Health, Population and Nutrition (J... | NaN | 1606-0997 | NaN | Yes | preserved | 2005-2015 | ISSN_16060997 | 2005 - v. 23 (3-4), 2006 - v. 24 (1-4), 2007 -... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1606-0997 | 1606-0997 |
-
-
-
-
35555 rows × 25 columns
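
Some rows come out of this left merge with no `issnl` (row 1 above, for instance). A minimal sketch, assuming a small hypothetical table `lookup` in place of `df_issnl`, of how the `indicator` option of `pd.merge` can list the rows that found no match:

```python
import pandas as pd

# Hypothetical data: three ISSN keys, only two of them known to the lookup table.
left = pd.DataFrame({'issn': ['2374-1406', '2369-6923', '1090-6517']})
lookup = pd.DataFrame({
    'issn': ['2374-1406', '1090-6517'],
    'issnl': ['2374-1406', '1063-777X'],
})

# indicator=True adds a '_merge' column: 'both' for matches, 'left_only' otherwise.
merged = pd.merge(left, lookup, on='issn', how='left', indicator=True)

# ISSNs for which no ISSN-L was found.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'issn'].tolist()
print(len(unmatched), 'unmatched:', unmatched)
```

Counting `left_only` rows gives the same information as filtering on a null `issnl`, provided the lookup table itself contains no null `issnl` values.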
-
-
-
-
-
-```python
-portico.columns
-```
-
-
-
-
- Index(['Publisher', 'Title', 'Society', 'Print ISSN', 'e-ISSN', 'PCA',
- 'Status', 'Years', 'ContentSet Id', 'Holdings', 'Unnamed: 10',
- 'Unnamed: 11', 'Unnamed: 12', 'Unnamed: 13', 'Unnamed: 14',
- 'Unnamed: 15', 'Unnamed: 16', 'Unnamed: 17', 'Unnamed: 18',
- 'Unnamed: 19', 'Unnamed: 20', 'Unnamed: 21', 'Unnamed: 22', 'issn',
- 'issnl'],
- dtype='object')
-
-
-
-
-```python
-# check the rows left unmatched by the merge
-portico.loc[portico['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | Society | Print ISSN | e-ISSN | PCA | Status | Years | ContentSet Id | Holdings | ... | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19 | Unnamed: 20 | Unnamed: 21 | Unnamed: 22 | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 1 | AECL Nuclear Review | CNL Nuclear Review | NaN | 2369-6931 | 2369-6923 | Yes | preserved | 2016-2020 | ISSN_23696931 | 2016 - v. 5 (1-2), 2016/2017 - v. 6 (1-2), 201... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2369-6923 | NaN |
-| 9 | AIP Publishing | APL Bioengineering | NaN | NaN | 2473-2877 | Yes | preserved | 2017-2021 | ISSN_247342877 | 2017 - v. 1 (1), 2018 - v. 2 (1-4), 2019 - v. ... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2473-2877 | NaN |
-| 14 | AIP Publishing | Biophysics Reviews | NaN | NaN | 2688-4089 | Yes | preserved | 2020-2021 | ISSN_26884089_15 | 2020 - v. 1 (1), 2021 - v. 2 (1) | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2688-4089 | NaN |
-| 16 | AIP Publishing | Journal of Undergraduate Reports in Physics | NaN | NaN | 2642-7451 | Yes | preserved | 2018-2020 | ISSN_26427451_15 | 2018 - v. 28 (1), 2019 - v. 29 (1), 2020 - v. ... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2642-7451 | NaN |
-| 20 | AIP Publishing | Nanotechnology and Precision Engineering | NaN | 1672-6030 | 2589-5540 | NaN | preserved | 2018-2021 | ISSN_16726030_15 | 2018 - v. 1 (1-4), 2019 - v. 2 (1-4), 2020 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2589-5540 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 35539 | World Scientific | Division of Labor & Transaction Costs | NaN | 0219-8711 | 1793-7000 | No | preserved | 2005-2011 | ISSN_02198711 | 2005/2006 - v. 1 (1-2), 2006/2007 - v. 2 (1-2)... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1793-7000 | NaN |
-| 35540 | World Scientific | Journal of Medical Robotics Research | NaN | 2424-905X | 2424-9068 | No | preserved | 2016-2020 | ISSN_2424905X | 2016 - v. 1 (1-4), 2017 - v. 2 (1-4), 2018 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2424-9068 | NaN |
-| 35541 | World Scientific | International Journal of Foundations of Comput... | NaN | 0129-0541 | 1793-6373 | No | preserved | 1990-2021 | ISSN_01290541 | 1990 - v. 1 (1-4), 1991 - v. 2 (1-4), 1992 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1793-6373 | NaN |
-| 35542 | World Scientific | Molecular Frontiers Journal | NaN | 2529-7325 | 2529-7333 | No | preserved | 2017-2020 | ISSN_25297325 | 2017 - v. 1 (1-2, null), 2018 - v. 2 (1), 2019... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2529-7333 | NaN |
-| 35543 | World Scientific | Water Economics and Policy | NaN | 2382-624X | 2382-6258 | No | preserved | 2015-2020 | ISSN_2382624X | 2015 - v. 1 (1-4), 2016 - v. 2 (1-4), 2017 - v... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2382-6258 | NaN |
-
-
-
-
4086 rows × 25 columns
-
-
-
-
-
-```python
-# use the ISSN instead on these rows
-portico.loc[portico['issnl'].isna(), 'issnl'] = portico['issn']
-```
-
-
-```python
-# check the rows left unmatched by the merge
-portico.loc[portico['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | Publisher | Title | Society | Print ISSN | e-ISSN | PCA | Status | Years | ContentSet Id | Holdings | ... | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19 | Unnamed: 20 | Unnamed: 21 | Unnamed: 22 | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 41 | ASTM International | ASTM Standards | NaN | NaN | NaN | Yes | queued | - | ASTM Standards | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 58 | Academic Journals | International Journal of Vocational and Techni... | NaN | NaN | NaN | NaN | queued | - | ISSN_TBD70 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 78 | Academic Journals | Journal of Metabolomics and Systems Biology | NaN | NaN | NaN | NaN | queued | - | ISSN_TBD68 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 180 | Academy of Research | The Microfinance Journal | NaN | NaN | NaN | NaN | queued | - | TBD_MJ_1242 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 254 | African Online Scientific Information Systems ... | Journal of African Foresight | NaN | NaN | NaN | NaN | queued | - | ISSN_TBD288 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 34911 | Wolters Kluwer Health | AJSP Open | NaN | NaN | NaN | Yes | queued | - | TBD_74_1 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 34915 | Wolters Kluwer Health | Annals of Surgery OA | NaN | NaN | NaN | Yes | queued | - | TBD_74_2 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35047 | Wolters Kluwer Health | Otology & Neurotology Open | NaN | NaN | NaN | Yes | queued | - | TBD_ONO_74 | - | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35058 | Wolters Kluwer Health | Northwest Journal of Optometry | NaN | NaN | NaN | Yes | preserved | 1924-1925 | NJO_74 | v.1(1-12),v.2(1-7) | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 35209 | Wolters Kluwer Health | Occupational Therapy & Rehabilitation | NaN | NaN | NaN | Yes | preserved | 1925-1951 | OTR_74 | v.22(1-6),v.23(1-6),v.24(1-6),v.25(1-6),v.26(1... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-
-
-
-
300 rows × 25 columns
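
The 300 rows above have no ISSN at all, so their `issnl` stays null. pandas matches null keys with each other in a merge, so such rows could attach a status to any record on the other side whose key is also null. Whether `journals` contains null `issnl` values is not shown here; the following is only a sketch on hypothetical data of the effect and of one possible guard:

```python
import numpy as np
import pandas as pd

# Hypothetical frames: one record on each side has a null key.
left = pd.DataFrame({'id': [1, 2], 'issnl': ['0031-9007', np.nan]})
right = pd.DataFrame({
    'issnl': ['0031-9007', np.nan],
    'Status': ['preserved', 'preserved'],
})

# Null keys match each other, so id 2 also receives a status.
with_nulls = pd.merge(left, right, on='issnl', how='left')

# Dropping key-less rows from the lookup side avoids the accidental match.
without_nulls = pd.merge(left, right.dropna(subset=['issnl']), on='issnl', how='left')

print(with_nulls)
print(without_nulls)
```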
-
-
-
-
-
-```python
-# add the Portico information:
-# Status
-portico_for_merge = portico[['issnl', 'Status']]
-portico_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | Status |
-|---|---|---|
-| 0 | 2374-1406 | preserved |
-| 1 | 2369-6923 | preserved |
-| 2 | 1929-8056 | preserved |
-| 3 | 1063-777X | preserved |
-| 4 | 0899-8213 | preserved |
-| ... | ... | ... |
-| 35550 | 2409-9694 | queued |
-| 35551 | 2410-4701 | queued |
-| 35552 | 2409-9848 | queued |
-| 35553 | 2410-2199 | queued |
-| 35554 | 1606-0997 | preserved |
-
-
-
-
35555 rows × 2 columns
-
-
-
-
-
-```python
-# keep only the rows with status "preserved"
-portico_for_merge = portico_for_merge.loc[portico_for_merge['Status'] == 'preserved']
-portico_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | Status |
-|---|---|---|
-| 0 | 2374-1406 | preserved |
-| 1 | 2369-6923 | preserved |
-| 2 | 1929-8056 | preserved |
-| 3 | 1063-777X | preserved |
-| 4 | 0899-8213 | preserved |
-| ... | ... | ... |
-| 35546 | 2572-5505 | preserved |
-| 35547 | 2225-0719 | preserved |
-| 35548 | 2472-0712 | preserved |
-| 35549 | 2377-231X | preserved |
-| 35554 | 1606-0997 | preserved |
-
-
-
-
33177 rows × 2 columns
-
-
-
-
-
-```python
-# rename the columns
-portico_for_merge = portico_for_merge.rename(columns={'Status': 'portico_status'})
-portico_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | issnl | portico_status |
-|---|---|---|
-| 0 | 2374-1406 | preserved |
-| 1 | 2369-6923 | preserved |
-| 2 | 1929-8056 | preserved |
-| 3 | 1063-777X | preserved |
-| 4 | 0899-8213 | preserved |
-| ... | ... | ... |
-| 35546 | 2572-5505 | preserved |
-| 35547 | 2225-0719 | preserved |
-| 35548 | 2472-0712 | preserved |
-| 35549 | 2377-231X | preserved |
-| 35554 | 1606-0997 | preserved |
-
-
-
-
33177 rows × 2 columns
-
-
-
-
-
-```python
-# merge with journals
-journals = pd.merge(journals, portico_for_merge, on='issnl', how='left')
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss | portico_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 | PLoS One | 1.0 | NaN |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1077 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 1078 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved |
-| 1079 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved |
-| 1080 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 1081 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN |
-
-
-
-
1082 rows × 17 columns
-
-
-
-
-
-```python
-# remove duplicates
-journals = journals.drop_duplicates(subset=['id'])
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss | portico_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 | PLoS One | 1.0 | NaN |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1076 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 1077 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 1078 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved |
-| 1080 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved |
-| 1081 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN |
-
-
-
-
996 rows × 17 columns
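
The left merge grows `journals` from 996 to 1082 rows because some `issnl` values occur more than once in `portico_for_merge`; the notebook then removes the extra rows with `drop_duplicates(subset=['id'])`. As a hedged alternative, sketched on hypothetical toy frames, the lookup table can be deduplicated before the merge and the `validate` option of `pd.merge` used to check that each key matches at most one row:

```python
import pandas as pd

# Hypothetical data: the key '0178-8051' appears twice in the status table.
journals_toy = pd.DataFrame({'id': [998, 999], 'issnl': ['0022-3468', '0178-8051']})
status_toy = pd.DataFrame({
    'issnl': ['0022-3468', '0178-8051', '0178-8051'],
    'portico_status': ['preserved', 'preserved', 'preserved'],
})

# Naive left merge: the repeated key duplicates journal 999.
naive = pd.merge(journals_toy, status_toy, on='issnl', how='left')

# Deduplicate the lookup side first; validate raises MergeError if keys still repeat.
status_unique = status_toy.drop_duplicates(subset=['issnl'])
clean = pd.merge(journals_toy, status_unique, on='issnl', how='left',
                 validate='many_to_one')

print(len(naive), len(clean))
```

Both routes end with one row per journal here because the duplicated status rows are identical; if they differed, the choice of which row to keep would matter.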
-
-
-
-
-
-```python
-# add a flag for presence in Portico
-journals = journals.copy()  # work on a copy to avoid SettingWithCopyWarning
-journals['portico'] = journals['portico_status'].notna().astype(float)
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss | portico_status | portico |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1 | Yes | 1.0 | PLoS One | 1.0 | NaN | 0.0 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1076 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 |
-| 1077 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 |
-| 1078 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 |
-| 1080 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 |
-| 1081 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 |
-
-
-
-
996 rows × 18 columns
-
-
-
-
-### National Licences
-
-
-```python
-# add preservation information from the National Licences (Cambridge University Press)
-nlch1 = pd.read_excel('licences_nationales/cambridge_Switzerland_NationalLicences_2020-08-17.xlsx')
-nlch1
-```
-
-
-
-
-
-
-
-
-
-
-| | publication_title | print_identifier | online_identifier | date_first_issue_online | num_first_vol_online | num_first_issue_online | date_last_issue_online | num_last_vol_online | num_last_issue_online | title_url | ... | publisher_name | publication_type | date_monograph_published_print | date_monograph_published_online | monograph_volume | monograph_edition | first_editor | parent_publication_title_id | preceding_publication_title_id | access_type |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | Journal of Agricultural and Applied Economics | 1074-0708 | NaN | 1969 | 1.0 | NaN | 2015 | 47.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 1 | Advances in Applied Mathematics and Mechanics | 2070-0733 | 2075-1354 | 2011 | 3.0 | NaN | 2015 | 8.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 2 | Annals of Actuarial Science | 1748-4995 | 1748-5002 | 2006 | 1.0 | NaN | 2015 | 9.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 3 | Advances in Animal Biosciences | 2040-4700 | 2040-4719 | 2010 | 1.0 | NaN | 2015 | 6.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 4 | Archaeologia | 0261-3409 | NaN | 1770 | 1.0 | NaN | 1992 | 110.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 389 | Zygote | 0967-1994 | 1469-8730 | 1993 | 1.0 | NaN | 2015 | 23.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 390 | Political Analysis | 1047-1987 | 1476-4989 | 1989 | 1.0 | NaN | 2015 | 23.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 391 | Business and Politics | 1369-5258 | 1469-3569 | 1999 | 1.0 | NaN | 2015 | 17.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 392 | Transactions of the Institute of Actuaries | 2047-2838 | 2398-7383 | 1849 | 1.0 | NaN | 1852 | 1.0 | NaN | http://www.cambridge.org/core/product/identifi... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-| 393 | Transactions of the International Astronomical... | NaN | 0251-107X | 1922 | 1.0 | 1.0 | 2007 | 25.0 | 2.0 | https://www.cambridge.org/core/journals/procee... | ... | Cambridge University Press | serial | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
-
-
-
-
394 rows × 25 columns
-
-
-
-
-
-```python
-# add preservation information from the National Licences (De Gruyter)
-nlch2 = pd.read_excel('licences_nationales/gruyter_Switzerland_NationalLicences_2020-11-30.xlsx')
-nlch2
-```
-
-
-
-
-
-
-
-
-
-
- publication_title
- print_identifier
- online_identifier
- date_first_issue_online
- num_first_vol_online
- num_first_issue_online
- date_last_issue_online
- num_last_vol_online
- num_last_issue_online
- title_url
- ...
- publisher_name
- publication_type
- date_monograph_published_print
- date_monograph_published_online
- monograph_volume
- monograph_edition
- first_editor
- parent_publication_title_id
- preceding_publication_title_id
- access_type
-
-
-
-
- 0
- ABI Technik
- 0720-6763
- 2191-4664
- 1996
- 16
- NaN
- 2017
- 37.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 1
- Accounting, Economics, and Law: A Convivium
- 2194-6051
- 2152-2820
- 2011
- 1
- NaN
- 2017
- 7.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 2
- Advanced Optical Technologies
- 2192-8576
- 2192-8584
- 2012
- 1
- NaN
- 2017
- 6.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 3
- Advances in Calculus of Variations
- 1864-8258
- 1864-8266
- 2008
- 1
- NaN
- 2017
- 10.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 4
- Advances in Geometry
- 1615-715X
- 1615-7168
- 2001
- 1
- NaN
- 2017
- 17.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 339
- Zeitschrift für Religionswissenschaft
- 0943-8610
- 2194-508X
- 1993
- 1
- NaN
- 2017
- 25.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 340
- Zeitschrift für romanische Philologie
- 0049-8661
- 1865-9063
- 1877
- 1
- NaN
- 2017
- 133.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 341
- Zeitschrift für Slawistik
- 0044-3506
- 2196-7016
- 1956
- 1
- NaN
- 2017
- 62.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter (A)
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
- 342
- Zeitschrift für Sprachwissenschaft
- 0721-9067
- 1613-3706
- 1982
- 1
- NaN
- 2017
- 36.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- F
-
-
- 343
- Zeitschrift für Unternehmens- und Gesellschaft...
- 0340-2479
- 1612-7048
- 1972
- 1
- NaN
- 2017
- 46.0
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- ...
- De Gruyter
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- P
-
-
-
-
344 rows × 25 columns
-
-
-
-
-
-```python
-# add the preservation information from the National Licences (Oxford file)
-nlch3 = pd.read_excel('licences_nationales/oxford_Switzerland_NationalLicences_2020-09-24.xlsx')
-nlch3
-```
-
-
-
-
-
-
-
-
-
-
- publication_title
- print_identifier
- online_identifier
- date_first_issue_online
- num_first_vol_online
- num_first_issue_online
- date_last_issue_online
- num_last_vol_online
- num_last_issue_online
- title_url
- ...
- publisher_name
- publication_type
- date_monograph_published_print
- date_monograph_published_online
- monograph_volume
- monograph_edition
- first_editor
- parent_publication_title_id
- preceding_publication_title_id
- access_type
-
-
-
-
- 0
- Acta Biochimica et Biophysica Sinica
- 1672-9145
- 1745-7270
- 2015
- 47.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/abbs
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1
- Archives of Clinical Neuropsychology
- 0887-6177
- 1873-5843
- 1986
- 1.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/acn
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 2
- Adaptation
- 1755-0637
- 1755-0645
- 2015
- 8.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/adaptation
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 3
- American Entomologist
- 1046-2821
- 2155-9902
- 1990
- 36.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/ae
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 4
- Applied Economic Perspectives and Policy
- 1058-7195
- 1467-9353
- 1988
- 1.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/aepp
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 343
- The Chinese Journal of Comparative Law
- 2050-4802
- 2050-4810
- 2018
- 6.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/cjcl
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 344
- Journal of Nutrition
- 0022-3166
- 1541-6100
- 2018
- 148.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/jn
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 345
- Translational Behavioral Medicine
- 1869-6716
- 1613-9860
- 2018
- 8.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/tbm
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 346
- The Western Historical Quarterly
- 0043-3810
- 1939-8603
- 2016
- 47.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/whq
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 347
- Zoological Journal of the Linnean Society
- 0024-4082
- 1096-3642
- 2017
- 179.0
- NaN
- 2018
- NaN
- NaN
- https://academic.oup.com/zoolinnean
- ...
- Oxford University Press
- serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
-
-
348 rows × 25 columns
-
-
-
-
-
-```python
-# add the preservation information from the National Licences (Springer file)
-nlch4 = pd.read_excel('licences_nationales/springer_Switzerland_NationalLicences_2020-08-12.xlsx')
-nlch4
-```
-
-
-
-
-
-
-
-
-
-
- publication_title
- print_identifier
- online_identifier
- date_first_issue_online
- num_first_vol_online
- num_first_issue_online
- date_last_issue_online
- num_last_vol_online
- num_last_issue_online
- title_url
- ...
- coverage_notes
- publisher_name
- publication_type
- date_monograph_published_print
- date_monograph_published_online
- monograph_volume
- monograph_edition
- first_editor
- parent_publication_title_id
- preceding_publication_title_id
-
-
-
-
- 0
- 4OR
- 1619-4500
- 1614-2411
- 2005
- 3.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/10288
- ...
- NaN
- Springer Berlin Heidelberg
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1
- AAPS PharmSciTech
- NaN
- 1530-9932
- 2005
- 6.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/12249
- ...
- NaN
- Springer US
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 2
- ADHD Attention Deficit and Hyperactivity Disor...
- 1866-6116
- 1866-6647
- 2009
- 1.0
- 1.0
- 2014
- NaN
- NaN
- http://link.springer.com/journal/12402
- ...
- NaN
- Springer Vienna
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 3
- AI & SOCIETY
- 0951-5666
- 1435-5655
- 1987
- 1.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/146
- ...
- NaN
- Springer London
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 4
- AIDS and Behavior
- 1090-7165
- 1573-3254
- 2005
- 9.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/10461
- ...
- NaN
- Springer US
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1667
- neurogenetics
- 1364-6745
- 1364-6753
- 2005
- 6.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/10048
- ...
- NaN
- Springer Berlin Heidelberg
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1668
- uwf UmweltWirtschaftsForum | Sustainability Ma...
- 0943-3481
- 1432-2293
- 2007
- 15.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/550
- ...
- NaN
- Springer Berlin Heidelberg
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1669
- Österreichische Wasser- und Abfallwirtschaft
- 0945-358X
- 1613-7566
- 2005
- 57.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/506
- ...
- NaN
- Springer Vienna
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1670
- Österreichische Zeitschrift für Soziologie
- 1011-0070
- 1862-2585
- 2005
- 30.0
- 1.0
- 2015
- NaN
- NaN
- http://link.springer.com/journal/11614
- ...
- NaN
- Springer Fachmedien Wiesbaden
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1671
- Journal Applied Mathematics Computing
- 1598-5865
- 1865-2085
- 1905
- NaN
- NaN
- 1905
- NaN
- NaN
- http://link.springer.com/journal/12190
- ...
- NaN
- Springer
- Serial
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
-
-
1672 rows × 24 columns
-
-
-
-
-
-```python
-# concatenate the four publisher tables
-nlch = pd.concat([nlch1, nlch2, nlch3, nlch4], ignore_index=True)
-nlch
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
- of pandas will change to not sort by default.
-
- To accept the future behavior, pass 'sort=False'.
-
- To retain the current behavior and silence the warning, pass 'sort=True'.
-
-
-
-
-
-
-
-
-
-
-
-
-
- access_type
- coverage_depth
- coverage_notes
- date_first_issue_online
- date_last_issue_online
- date_monograph_published_online
- date_monograph_published_print
- embargo_info
- first_author
- first_editor
- ...
- num_last_vol_online
- online_identifier
- parent_publication_title_id
- preceding_publication_title_id
- print_identifier
- publication_title
- publication_type
- publisher_name
- title_id
- title_url
-
-
-
-
- 0
- NaN
- fulltext
- NaN
- 1969
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 47.0
- NaN
- NaN
- NaN
- 1074-0708
- Journal of Agricultural and Applied Economics
- serial
- Cambridge University Press
- aae
- http://www.cambridge.org/core/product/identifi...
-
-
- 1
- NaN
- fulltext
- NaN
- 2011
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 8.0
- 2075-1354
- NaN
- NaN
- 2070-0733
- Advances in Applied Mathematics and Mechanics
- serial
- Cambridge University Press
- aam
- http://www.cambridge.org/core/product/identifi...
-
-
- 2
- NaN
- fulltext
- NaN
- 2006
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 9.0
- 1748-5002
- NaN
- NaN
- 1748-4995
- Annals of Actuarial Science
- serial
- Cambridge University Press
- aas
- http://www.cambridge.org/core/product/identifi...
-
-
- 3
- NaN
- fulltext
- NaN
- 2010
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 6.0
- 2040-4719
- NaN
- NaN
- 2040-4700
- Advances in Animal Biosciences
- serial
- Cambridge University Press
- abs
- http://www.cambridge.org/core/product/identifi...
-
-
- 4
- NaN
- fulltext
- NaN
- 1770
- 1992
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 110.0
- NaN
- NaN
- NaN
- 0261-3409
- Archaeologia
- serial
- Cambridge University Press
- ach
- http://www.cambridge.org/core/product/identifi...
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 2753
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- 1364-6753
- NaN
- NaN
- 1364-6745
- neurogenetics
- Serial
- Springer Berlin Heidelberg
- 10048
- http://link.springer.com/journal/10048
-
-
- 2754
- NaN
- fulltext
- NaN
- 2007
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- 1432-2293
- NaN
- NaN
- 0943-3481
- uwf UmweltWirtschaftsForum | Sustainability Ma...
- Serial
- Springer Berlin Heidelberg
- 550
- http://link.springer.com/journal/550
-
-
- 2755
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- 1613-7566
- NaN
- NaN
- 0945-358X
- Österreichische Wasser- und Abfallwirtschaft
- Serial
- Springer Vienna
- 506
- http://link.springer.com/journal/506
-
-
- 2756
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- 1862-2585
- NaN
- NaN
- 1011-0070
- Österreichische Zeitschrift für Soziologie
- Serial
- Springer Fachmedien Wiesbaden
- 11614
- http://link.springer.com/journal/11614
-
-
- 2757
- NaN
- fulltext
- NaN
- 1905
- 1905
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- 1865-2085
- NaN
- NaN
- 1598-5865
- Journal Applied Mathematics Computing
- Serial
- Springer
- 12190
- http://link.springer.com/journal/12190
-
-
-
-
2758 rows × 26 columns
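
The `FutureWarning` printed before this table is raised because the publisher tables do not all share the same columns (the Springer file shows `coverage_notes` and has 24 columns, the others 25). A minimal sketch on toy data of how the `sort` argument of `pd.concat` affects column order; the two small frames and their values are illustrative assumptions, not rows taken from the licence files:

```python
import pandas as pd

# Toy frames standing in for two publisher tables with different column sets.
a = pd.DataFrame({"publication_title": ["Zygote"], "access_type": ["P"]})
b = pd.DataFrame({"publication_title": ["4OR"], "coverage_notes": ["example note"]})

# sort=False keeps columns in order of first appearance.
unsorted_cols = pd.concat([a, b], ignore_index=True, sort=False)

# sort=True orders the columns alphabetically, which matches the
# alphabetical column order seen in the concatenated output.
sorted_cols = pd.concat([a, b], ignore_index=True, sort=True)

print(list(unsorted_cols.columns))
print(list(sorted_cols.columns))
```

Passing `sort` explicitly, whichever value is chosen, should silence the warning and make the column order independent of the pandas version.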
-
-
-
-
-
-```python
-nlch.columns
-```
-
-
-
-
- Index(['access_type', 'coverage_depth', 'coverage_notes',
- 'date_first_issue_online', 'date_last_issue_online',
- 'date_monograph_published_online', 'date_monograph_published_print',
- 'embargo_info', 'first_author', 'first_editor', 'monograph_edition',
- 'monograph_volume', 'notes', 'num_first_issue_online',
- 'num_first_vol_online', 'num_last_issue_online', 'num_last_vol_online',
- 'online_identifier', 'parent_publication_title_id',
- 'preceding_publication_title_id', 'print_identifier',
- 'publication_title', 'publication_type', 'publisher_name', 'title_id',
- 'title_url'],
- dtype='object')
-
-
-
-
-```python
-# add ISSN-L: first build the lookup key, preferring the online ISSN and falling back to the print ISSN
-nlch['issn'] = nlch['online_identifier']
-nlch.loc[nlch['online_identifier'].isna(), 'issn'] = nlch['print_identifier']
-nlch
-```
-
-
-
-
-
-
-
-
-
-
- access_type
- coverage_depth
- coverage_notes
- date_first_issue_online
- date_last_issue_online
- date_monograph_published_online
- date_monograph_published_print
- embargo_info
- first_author
- first_editor
- ...
- online_identifier
- parent_publication_title_id
- preceding_publication_title_id
- print_identifier
- publication_title
- publication_type
- publisher_name
- title_id
- title_url
- issn
-
-
-
-
- 0
- NaN
- fulltext
- NaN
- 1969
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- 1074-0708
- Journal of Agricultural and Applied Economics
- serial
- Cambridge University Press
- aae
- http://www.cambridge.org/core/product/identifi...
- 1074-0708
-
-
- 1
- NaN
- fulltext
- NaN
- 2011
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 2075-1354
- NaN
- NaN
- 2070-0733
- Advances in Applied Mathematics and Mechanics
- serial
- Cambridge University Press
- aam
- http://www.cambridge.org/core/product/identifi...
- 2075-1354
-
-
- 2
- NaN
- fulltext
- NaN
- 2006
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 1748-5002
- NaN
- NaN
- 1748-4995
- Annals of Actuarial Science
- serial
- Cambridge University Press
- aas
- http://www.cambridge.org/core/product/identifi...
- 1748-5002
-
-
- 3
- NaN
- fulltext
- NaN
- 2010
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 2040-4719
- NaN
- NaN
- 2040-4700
- Advances in Animal Biosciences
- serial
- Cambridge University Press
- abs
- http://www.cambridge.org/core/product/identifi...
- 2040-4719
-
-
- 4
- NaN
- fulltext
- NaN
- 1770
- 1992
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- 0261-3409
- Archaeologia
- serial
- Cambridge University Press
- ach
- http://www.cambridge.org/core/product/identifi...
- 0261-3409
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 2753
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 1364-6753
- NaN
- NaN
- 1364-6745
- neurogenetics
- Serial
- Springer Berlin Heidelberg
- 10048
- http://link.springer.com/journal/10048
- 1364-6753
-
-
- 2754
- NaN
- fulltext
- NaN
- 2007
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 1432-2293
- NaN
- NaN
- 0943-3481
- uwf UmweltWirtschaftsForum | Sustainability Ma...
- Serial
- Springer Berlin Heidelberg
- 550
- http://link.springer.com/journal/550
- 1432-2293
-
-
- 2755
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 1613-7566
- NaN
- NaN
- 0945-358X
- Österreichische Wasser- und Abfallwirtschaft
- Serial
- Springer Vienna
- 506
- http://link.springer.com/journal/506
- 1613-7566
-
-
- 2756
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 1862-2585
- NaN
- NaN
- 1011-0070
- Österreichische Zeitschrift für Soziologie
- Serial
- Springer Fachmedien Wiesbaden
- 11614
- http://link.springer.com/journal/11614
- 1862-2585
-
-
- 2757
- NaN
- fulltext
- NaN
- 1905
- 1905
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- 1865-2085
- NaN
- NaN
- 1598-5865
- Journal Applied Mathematics Computing
- Serial
- Springer
- 12190
- http://link.springer.com/journal/12190
- 1865-2085
-
-
-
-
2758 rows × 27 columns
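
The fallback above only triggers when `online_identifier` is a true `NaN`. The list of unmatched rows further down appears to contain `issn` values of `-` or an empty string, which `isna()` does not catch. A hedged sketch of one way to normalise such placeholders before the fallback; the toy frame is an assumption built for illustration, and treating `-` and blank strings as "missing" is also an assumption about the data:

```python
import numpy as np
import pandas as pd

# Toy frame: a valid online ISSN, a real NaN, a "-" placeholder and a blank.
toy = pd.DataFrame({
    "online_identifier": ["2075-1354", np.nan, "-", ""],
    "print_identifier": ["2070-0733", "0261-3409", "0021-924X", "0253-486X"],
})

# Strip whitespace, then turn placeholder strings into NaN.
cleaned = toy["online_identifier"].str.strip()
cleaned = cleaned.mask(cleaned.isin(["", "-"]))

# Prefer the online ISSN, fall back to the print ISSN where it is missing.
toy["issn"] = cleaned.fillna(toy["print_identifier"])

print(toy["issn"].tolist())
```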
-
-
-
-
-
-```python
-nlch = pd.merge(nlch, df_issnl, on='issn', how='left')
-nlch
-```
-
-
-
-
-
-
-
-
-
-
- access_type
- coverage_depth
- coverage_notes
- date_first_issue_online
- date_last_issue_online
- date_monograph_published_online
- date_monograph_published_print
- embargo_info
- first_author
- first_editor
- ...
- parent_publication_title_id
- preceding_publication_title_id
- print_identifier
- publication_title
- publication_type
- publisher_name
- title_id
- title_url
- issn
- issnl
-
-
-
-
- 0
- NaN
- fulltext
- NaN
- 1969
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1074-0708
- Journal of Agricultural and Applied Economics
- serial
- Cambridge University Press
- aae
- http://www.cambridge.org/core/product/identifi...
- 1074-0708
- 1074-0708
-
-
- 1
- NaN
- fulltext
- NaN
- 2011
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2070-0733
- Advances in Applied Mathematics and Mechanics
- serial
- Cambridge University Press
- aam
- http://www.cambridge.org/core/product/identifi...
- 2075-1354
- 2070-0733
-
-
- 2
- NaN
- fulltext
- NaN
- 2006
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1748-4995
- Annals of Actuarial Science
- serial
- Cambridge University Press
- aas
- http://www.cambridge.org/core/product/identifi...
- 1748-5002
- 1748-4995
-
-
- 3
- NaN
- fulltext
- NaN
- 2010
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2040-4700
- Advances in Animal Biosciences
- serial
- Cambridge University Press
- abs
- http://www.cambridge.org/core/product/identifi...
- 2040-4719
- 2040-4700
-
-
- 4
- NaN
- fulltext
- NaN
- 1770
- 1992
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0261-3409
- Archaeologia
- serial
- Cambridge University Press
- ach
- http://www.cambridge.org/core/product/identifi...
- 0261-3409
- 0261-3409
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 2753
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1364-6745
- neurogenetics
- Serial
- Springer Berlin Heidelberg
- 10048
- http://link.springer.com/journal/10048
- 1364-6753
- 1364-6745
-
-
- 2754
- NaN
- fulltext
- NaN
- 2007
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0943-3481
- uwf UmweltWirtschaftsForum | Sustainability Ma...
- Serial
- Springer Berlin Heidelberg
- 550
- http://link.springer.com/journal/550
- 1432-2293
- 0943-3481
-
-
- 2755
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0945-358X
- Österreichische Wasser- und Abfallwirtschaft
- Serial
- Springer Vienna
- 506
- http://link.springer.com/journal/506
- 1613-7566
- 0945-358X
-
-
- 2756
- NaN
- fulltext
- NaN
- 2005
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1011-0070
- Österreichische Zeitschrift für Soziologie
- Serial
- Springer Fachmedien Wiesbaden
- 11614
- http://link.springer.com/journal/11614
- 1862-2585
- 1011-0070
-
-
- 2757
- NaN
- fulltext
- NaN
- 1905
- 1905
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1598-5865
- Journal Applied Mathematics Computing
- Serial
- Springer
- 12190
- http://link.springer.com/journal/12190
- 1865-2085
- 1598-5865
-
-
-
-
2758 rows × 28 columns
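
The row count stays at 2758 after the left merge, which suggests that each `issn` matches at most one row of `df_issnl`. A small sketch of how that assumption and the unmatched rows could be checked during the merge itself, using the `validate` and `indicator` options of `pd.merge`; the `journals` and `lookup` frames are hypothetical stand-ins for `nlch` and `df_issnl`:

```python
import pandas as pd

# Hypothetical stand-ins for nlch and df_issnl.
journals = pd.DataFrame({
    "issn": ["2075-1354", "1748-748X"],
    "publication_title": [
        "Advances in Applied Mathematics and Mechanics",
        "Animal science",
    ],
})
lookup = pd.DataFrame({
    "issn": ["2075-1354", "2070-0733"],
    "issnl": ["2070-0733", "2070-0733"],
})

# validate raises MergeError if an ISSN appears more than once in the lookup;
# indicator adds a "_merge" column saying where each row was found.
merged = pd.merge(journals, lookup, on="issn", how="left",
                  validate="many_to_one", indicator=True)

# Rows that found no ISSN-L in the lookup table.
unmatched = merged.loc[merged["_merge"] == "left_only"]
print(unmatched[["issn", "publication_title"]])
```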
-
-
-
-
-
-```python
-# check for rows that the merge left without an ISSN-L
-nlch.loc[nlch['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- access_type
- coverage_depth
- coverage_notes
- date_first_issue_online
- date_last_issue_online
- date_monograph_published_online
- date_monograph_published_print
- embargo_info
- first_author
- first_editor
- ...
- parent_publication_title_id
- preceding_publication_title_id
- print_identifier
- publication_title
- publication_type
- publisher_name
- title_id
- title_url
- issn
- issnl
-
-
-
-
- 37
- NaN
- fulltext
- NaN
- 1959
- 2006
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1357-7298
- Animal science
- serial
- Cambridge University Press
- asc
- http://www.cambridge.org/core/product/identifi...
- 1748-748X
- NaN
-
-
- 52
- NaN
- fulltext
- NaN
- 1957
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2055-7973
- British Catholic History
- serial
- Cambridge University Press
- bch
- http://www.cambridge.org/core/product/identifi...
- 2055-7981
- NaN
-
-
- 76
- NaN
- fulltext
- NaN
- 1882
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1750-2705
- Cambridge Classical Journal
- serial
- Cambridge University Press
- ccj
- http://www.cambridge.org/core/product/identifi...
- 2047-993X
- NaN
-
-
- 110
- NaN
- fulltext
- NaN
- 2011
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2079-7362
- East Asian Journal on Applied Mathematics
- serial
- Cambridge University Press
- eam
- http://www.cambridge.org/core/product/identifi...
- 2079-7370
- NaN
-
-
- 152
- NaN
- fulltext
- NaN
- 1980
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2051-5367
- Hegel Bulletin
- serial
- Cambridge University Press
- hgl
- http://www.cambridge.org/core/product/identifi...
- 2051-5375
- NaN
-
-
- 194
- NaN
- fulltext
- NaN
- 1991
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2055-6365
- Journal of Psychologists and Counsellors in Sc...
- serial
- Cambridge University Press
- jgc
- http://www.cambridge.org/core/product/identifi...
- 2055-6373
- NaN
-
-
- 200
- NaN
- fulltext
- NaN
- 1911
- 1993
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2049-9299
- Journal of the Staple Inn Actuarial Society
- serial
- Cambridge University Press
- jis
- http://www.cambridge.org/core/product/identifi...
- 2059-6162
- NaN
-
-
- 267
- NaN
- fulltext
- NaN
- 2009
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0016-7746
- Netherlands Journal of Geosciences / Geologie ...
- serial
- Cambridge University Press
- njg
- http://www.cambridge.org/core/product/identifi...
- 1573-9708
- NaN
-
-
- 278
- NaN
- fulltext
- NaN
- 2008
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Australasian Journal of Organisational Psychology
- serial
- Cambridge University Press
- orp
- http://www.cambridge.org/core/product/identifi...
- 2054-2232
- NaN
-
-
- 375
- NaN
- fulltext
- NaN
- 1788
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1755-6910
- Earth and environmental science transactions o...
- serial
- Royal Society of Edinburgh Scotland Foundation
- tre
- http://www.cambridge.org/core/product/identifi...
- 1755-6929
- NaN
-
-
- 405
- P
- fulltext
- NaN
- 1855
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0341-289X
- Annalen des Historischen Vereins für den Niede...
- yearbook
- Böhlau Verlag
- 2194-3818
- https://www.degruyter.com/openurl?genre=journa...
- 2194-3818
- NaN
-
-
- 411
- P
- fulltext
- NaN
- 1955
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0066-6297
- Archiv für Diplomatik, Schriftgeschichte, Sieg...
- yearbook
- Böhlau Verlag
- 2194-5020
- https://www.degruyter.com/openurl?genre=journa...
- 2194-5020
- NaN
-
-
- 413
- P
- fulltext
- NaN
- 1903
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0003-9233
- Archiv für Kulturgeschichte
- serial
- Böhlau Verlag
- 2194-3958
- https://www.degruyter.com/openurl?genre=journa...
- 2194-3958
- NaN
-
-
- 418
- P
- fulltext
- NaN
- 1876
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0003-9497
- Archivalische Zeitschrift
- serial
- Böhlau Verlag
- 2194-3826
- https://www.degruyter.com/openurl?genre=journa...
- 2194-3826
- NaN
-
-
- 427
- P
- fulltext
- NaN
- 1948
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0006-2456
- Bildung und Erziehung
- serial
- Böhlau Verlag
- 2194-3834
- https://www.degruyter.com/openurl?genre=journa...
- 2194-3834
- NaN
-
-
- 458
- P
- fulltext
- NaN
- 1867
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0070-444X
- Deutsches Dante-Jahrbuch
- yearbook
- De Gruyter
- 2194-4059
- https://www.degruyter.com/openurl?genre=journa...
- 2194-4059
- NaN
-
-
- 468
- P
- fulltext
- NaN
- 1994
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2566-9095
- Etruscan and Italic Studies
- serial
- De Gruyter
- 2566-9109
- https://www.degruyter.com/openurl?genre=journa...
- 2566-9109
- NaN
-
-
- 479
- P
- fulltext
- NaN
- 2005
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2567-4765
- FinanzRundschau
- serial
- Verlag Dr. Otto Schmidt
- 2567-4897
- https://www.degruyter.com/openurl?genre=journa...
- 2567-4897
- NaN
-
-
- 530
- P
- fulltext
- NaN
- 1969
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0074-9818
- Internationales Jahrbuch der Erwachsenenbildung
- yearbook
- Böhlau Verlag
- 2194-3699
- https://www.degruyter.com/openurl?genre=journa...
- 2194-3699
- NaN
-
-
- 537
- P
- fulltext
- NaN
- 1912
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0341-9320
- Jahrbuch des Kölnischen Geschichtsvereins
- yearbook
- Böhlau Verlag
- 2198-0675
- https://www.degruyter.com/openurl?genre=journa...
- 2198-0675
- NaN
-
-
- 561
- P
- fulltext
- NaN
- 2012
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2194-6345
- Journal of Econometric Methods
- serial
- De Gruyter
- 2156-6674
- https://www.degruyter.com/openurl?genre=journa...
- 2156-6674
- NaN
-
-
- 570
- F
- fulltext
- NaN
- 1977
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2567-9430
- Journal of Laboratory Medicine
- serial
- De Gruyter
- 2567-9449
- https://www.degruyter.com/openurl?genre=journa...
- 2567-9449
- NaN
-
-
- 675
- P
- fulltext
- NaN
- 1950
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0080-5319
- Saeculum
- serial
- Böhlau Verlag
- 2194-4075
- https://www.degruyter.com/openurl?genre=journa...
- 2194-4075
- NaN
-
-
- 708
- P
- fulltext
- NaN
- 2005
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2363-4774
- World Political Science
- serial
- De Gruyter
- 2363-4782
- https://www.degruyter.com/openurl?genre=journa...
- 2363-4782
- NaN
-
-
- 709
- P
- fulltext
- NaN
- 2014
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2196-6249
- Yearbook for European Jewish Literature Studies
- yearbook
- De Gruyter
- 2196-6257
- https://www.degruyter.com/openurl?genre=journa...
- 2196-6257
- NaN
-
-
- 712
- P
- fulltext
- NaN
- 1861
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Zeitschrift der Savigny-Stiftung für Rechtsges...
- serial
- NaN
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- NaN
- NaN
-
-
- 713
- P
- fulltext
- NaN
- 1911
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Zeitschrift der Savigny-Stiftung für Rechtsges...
- serial
- NaN
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- NaN
- NaN
-
-
- 714
- P
- fulltext
- NaN
- 1880
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Zeitschrift der Savigny-Stiftung für Rechtsges...
- serial
- NaN
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- NaN
- NaN
-
-
- 766
- NaN
- fulltext
- NaN
- 2015
- 2018
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2041-2649
- Briefings in Functional Genomics
- serial
- Oxford University Press
- bfgp
- https://academic.oup.com/bfgp
- 2041-2647
- NaN
-
-
- 890
- NaN
- fulltext
- NaN
- 1922
- 2018
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0021-924X
- The Journal of Biochemistry
- serial
- Oxford University Press
- jbchem
- https://academic.oup.com/jb
- -
- NaN
-
-
- 926
- NaN
- fulltext
- NaN
- 1889
- 2018
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0024-2160
- The Library
- serial
- Oxford University Press
- libraj
- https://academic.oup.com/library
- -
- NaN
-
-
- 1010
- NaN
- fulltext
- NaN
- 1977
- 1992
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0148-0847
- Social Work Research and Abstracts
- serial
- Oxford University Press
- swra
- https://academic.oup.com/swra
- 1001-3412
- NaN
-
-
- 1057
- NaN
- fulltext
- NaN
- 2017
- 2018
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0021-972X
- The Journal of Clinical Endocrinology & Metabo...
- serial
- Oxford University Press
- jcem
- https://academic.oup.com/jcem
- 1845-7197
- NaN
-
-
- 1074
- NaN
- fulltext
- NaN
- 2018
- 2018
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2398-4910
- Perspectives on Public Management and Governance
- serial
- Oxford University Press
- ppmg
- https://academic.oup.com/ppmg
- 2398-4929
- NaN
-
-
- 1094
- NaN
- fulltext
- NaN
- 1976
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 2366-004X
- Abdominal Radiology
- Serial
- Springer US
- 261
- http://link.springer.com/journal/261
- 2366-0058
- NaN
-
-
- 1105
- NaN
- fulltext
- NaN
- 1982
- 1985
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0253-486X
- Geochemistry
- Serial
- Science Press
- 11631
- http://link.springer.com/journal/11631
-
- NaN
-
-
- 1148
- NaN
- fulltext
- NaN
- 1975
- 2004
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1066-2316
- American Journal of Criminal Justice
- Serial
- Springer US
- 12103
- http://link.springer.com/journal/12103
- 1936-1351
- NaN
-
-
- 1218
- NaN
- fulltext
- NaN
- 2006
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1862-3522
- Archives of Osteoporosis
- Serial
- Springer London
- 11657
- http://link.springer.com/journal/11657
- 1862-3514
- NaN
-
-
- 1363
- NaN
- fulltext
- NaN
- 1995
- 2002
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1006-6497
- Chinese journal of integrated traditional and ...
- Serial
- Springer Berlin Heidelberg
- 11655
- http://link.springer.com/journal/11655
-
- NaN
-
-
- 1365
- NaN
- fulltext
- NaN
- 2009
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0256-7679
- Chinese Journal of Polymer Science
- Serial
- Chinese Chemical Society and Institute of Chem...
- 10118
- http://link.springer.com/journal/10118
- 1439-6203
- NaN
-
-
- 1382
- NaN
- fulltext
- NaN
- 1983
- 1994
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0731-8235
- Clinical reviews in allergy
- Serial
- Springer US
- 12016
- http://link.springer.com/journal/12016
-
- NaN
-
-
- 1383
- NaN
- fulltext
- NaN
- 1982
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0770-3198
- Clinical Rheumatology
- Serial
- Springer London
- 10067
- http://link.springer.com/journal/10067
- 1434-9949
- NaN
-
-
- 1938
- NaN
- fulltext
- NaN
- 2008
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1936-1521
- Journal of Child & Adolescent Trauma
- Serial
- Springer International Publishing
- 40653
- http://link.springer.com/journal/40653
- 1936-153X
- NaN
-
-
- 2003
- NaN
- fulltext
- NaN
- 1986
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0884-8734
- Journal of General Internal Medicine
- Serial
- Springer US
- 11606
- http://link.springer.com/journal/11606
- 1525-1497
- NaN
-
-
- 2136
- NaN
- fulltext
- NaN
- 2006
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 1009-6124
- Journal of Systems Science and Complexity
- Serial
- Academy of Mathematics and Systems Science, Ch...
- 11424
- http://link.springer.com/journal/11424
- 1559-7067
- NaN
-
-
- 2255
- NaN
- fulltext
- NaN
- 1974
- 2015
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0095-3628
- Microbial Ecology
- Serial
- Springer US
- 248
- http://link.springer.com/journal/248
- 1432-184X
- NaN
-
-
- 2355
- NaN
- fulltext
- NaN
- 1992
- 1995
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- 0941-2530
- Orthopedics and Traumatology
- Serial
- Urban & Vogel
- 65
- http://link.springer.com/journal/65
- 1617-3838
- NaN
-
-
- 2674
- NaN
- fulltext
- NaN
- 1883
- 1887
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Transactions of the Academy of Medicine in Ire...
- Serial
- Springer-Verlag
- 12680
- http://link.springer.com/journal/12680
- NaN
- NaN
-
-
-
-
48 rows × 28 columns
-
-
-
-
-
-```python
-# use the ISSN instead on these rows
-nlch.loc[nlch['issnl'].isna(), 'issnl'] = nlch['issn']
-```
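
After this fallback, the `issnl` column mixes values from the lookup table with plain ISSNs. A minimal sketch, on toy data, of recording where each value came from before overwriting it; the `issnl_source` column name and its labels are hypothetical and not part of the notebook:

```python
import numpy as np
import pandas as pd

# Toy frame: one row without ISSN-L, one without any identifier, one matched.
toy = pd.DataFrame({
    "issn": ["1748-748X", np.nan, "2075-1354"],
    "issnl": [np.nan, np.nan, "2070-0733"],
})

# Record the provenance of the final value before filling the gaps.
toy["issnl_source"] = np.where(
    toy["issnl"].notna(), "lookup",
    np.where(toy["issn"].notna(), "issn_fallback", "missing"),
)

# Same fallback as in the cell above, written with fillna.
toy["issnl"] = toy["issnl"].fillna(toy["issn"])

still_missing = int(toy["issnl"].isna().sum())
print(toy)
print("rows still without ISSN-L:", still_missing)
```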
-
-
-```python
-# check for rows that still have no ISSN-L after the fallback
-nlch.loc[nlch['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- access_type
- coverage_depth
- coverage_notes
- date_first_issue_online
- date_last_issue_online
- date_monograph_published_online
- date_monograph_published_print
- embargo_info
- first_author
- first_editor
- ...
- parent_publication_title_id
- preceding_publication_title_id
- print_identifier
- publication_title
- publication_type
- publisher_name
- title_id
- title_url
- issn
- issnl
-
-
-
-
- 712
- P
- fulltext
- NaN
- 1861
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Zeitschrift der Savigny-Stiftung für Rechtsges...
- serial
- NaN
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- NaN
- NaN
-
-
- 713
- P
- fulltext
- NaN
- 1911
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Zeitschrift der Savigny-Stiftung für Rechtsges...
- serial
- NaN
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- NaN
- NaN
-
-
- 714
- P
- fulltext
- NaN
- 1880
- 2017
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Zeitschrift der Savigny-Stiftung für Rechtsges...
- serial
- NaN
- NaN
- https://www.degruyter.com/openurl?genre=journa...
- NaN
- NaN
-
-
- 2674
- NaN
- fulltext
- NaN
- 1883
- 1887
- NaN
- NaN
- NaN
- NaN
- NaN
- ...
- NaN
- NaN
- NaN
- Transactions of the Academy of Medicine in Ire...
- Serial
- Springer-Verlag
- 12680
- http://link.springer.com/journal/12680
- NaN
- NaN
-
-
-
-
4 rows × 28 columns
-
-
-
-
-
-```python
-# add the nlch info:
-# publication_title
-nlch_for_merge = nlch[['issnl', 'publication_title']]
-nlch_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-|      | issnl     | publication_title |
-|------|-----------|-------------------|
-| 0    | 1074-0708 | Journal of Agricultural and Applied Economics |
-| 1    | 2070-0733 | Advances in Applied Mathematics and Mechanics |
-| 2    | 1748-4995 | Annals of Actuarial Science |
-| 3    | 2040-4700 | Advances in Animal Biosciences |
-| 4    | 0261-3409 | Archaeologia |
-| ...  | ...       | ... |
-| 2753 | 1364-6745 | neurogenetics |
-| 2754 | 0943-3481 | uwf UmweltWirtschaftsForum \| Sustainability Ma... |
-| 2755 | 0945-358X | Österreichische Wasser- und Abfallwirtschaft |
-| 2756 | 1011-0070 | Österreichische Zeitschrift für Soziologie |
-| 2757 | 1598-5865 | Journal Applied Mathematics Computing |
-
-
-
-
2758 rows × 2 columns
-
-
-
-
-
-```python
-# rename the columns
-nlch_for_merge = nlch_for_merge.rename(columns={'publication_title': 'nlch_title'})
-nlch_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-|      | issnl     | nlch_title |
-|------|-----------|------------|
-| 0    | 1074-0708 | Journal of Agricultural and Applied Economics |
-| 1    | 2070-0733 | Advances in Applied Mathematics and Mechanics |
-| 2    | 1748-4995 | Annals of Actuarial Science |
-| 3    | 2040-4700 | Advances in Animal Biosciences |
-| 4    | 0261-3409 | Archaeologia |
-| ...  | ...       | ... |
-| 2753 | 1364-6745 | neurogenetics |
-| 2754 | 0943-3481 | uwf UmweltWirtschaftsForum \| Sustainability Ma... |
-| 2755 | 0945-358X | Österreichische Wasser- und Abfallwirtschaft |
-| 2756 | 1011-0070 | Österreichische Zeitschrift für Soziologie |
-| 2757 | 1598-5865 | Journal Applied Mathematics Computing |
-
-
-
-
2758 rows × 2 columns
-
-
-
-
-
-```python
-# merge with journals
-journals = pd.merge(journals, nlch_for_merge, on='issnl', how='left')
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- doaj_title
- doaj_seal
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- PLoS ONE
- 1
- Yes
- 1.0
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 993
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
-
-
- 994
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
-
-
- 995
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- NaN
- NaN
- NaN
- 0.0
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
-
-
- 996
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
-
-
- 997
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
-
-
998 rows × 19 columns
-
-
-
-
-
-```python
-# add a flag for presence in nlch
-journals.loc[journals['nlch_title'].isna(), 'nlch'] = 0
-journals.loc[~journals['nlch_title'].isna(), 'nlch'] = 1
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- doaj_title
- doaj_seal
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- PLoS ONE
- 1
- Yes
- 1.0
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
- 0.0
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 993
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
-
-
- 994
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
-
-
- 995
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- NaN
- NaN
- NaN
- 0.0
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
- 1.0
-
-
- 996
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
-
-
- 997
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- NaN
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
-
-
-
-
998 rows × 20 columns
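The two masked assignments that build the `nlch` flag can also be written as a single expression. The sketch below is only an illustration on a made-up frame `toy`; note that it produces integer flags (1 / 0), whereas the notebook's version yields floats (1.0 / 0.0).

```python
import pandas as pd

# Toy stand-in for the merged journals table (rows are illustrative)
toy = pd.DataFrame({
    "issnl": ["0178-8051", "1660-9379"],
    "nlch_title": ["Probability Theory and Related Fields", None],
})

# notna() yields booleans; astype(int) turns them into 1 / 0 flags
toy["nlch"] = toy["nlch_title"].notna().astype(int)
print(toy)
```

The same pattern would apply to any flag column derived from whether a merged title is present.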
-
-
-
-
-### QOAM
-
-
-```python
-# open the file
-qoam = pd.read_csv('qoam/qoam_not_zero.tsv', encoding='utf-8', header=0, sep='\t')
-qoam
-```
-
-
-
-
-
-
-
-
-
-
-|      | issn      | qoam_av_score |
-|------|-----------|---------------|
-| 0    | 2254-5883 | 5.0 |
-| 1    | 2279-7254 | 5.0 |
-| 2    | 2317-3076 | 5.0 |
-| 3    | 2525-3468 | 5.0 |
-| 4    | 1339-8474 | 5.0 |
-| ...  | ...       | ... |
-| 3018 | 2083-4810 | 1.0 |
-| 3019 | 1759-2208 | 1.0 |
-| 3020 | 0219-9874 | 1.0 |
-| 3021 | 2083-6139 | 1.0 |
-| 3022 | 2312-2757 | 1.0 |
-
-
-
-
3023 rows × 2 columns
-
-
-
-
-
-```python
-qoam = pd.merge(qoam, df_issnl, on='issn', how='left')
-qoam
-```
-
-
-
-
-
-
-
-
-
-
-|      | issn      | qoam_av_score | issnl     |
-|------|-----------|---------------|-----------|
-| 0    | 2254-5883 | 5.0 | 2254-5883 |
-| 1    | 2279-7254 | 5.0 | 2279-7254 |
-| 2    | 2317-3076 | 5.0 | 2317-3076 |
-| 3    | 2525-3468 | 5.0 | 2525-3468 |
-| 4    | 1339-8474 | 5.0 | 1339-8474 |
-| ...  | ...       | ... | ...       |
-| 3018 | 2083-4810 | 1.0 | 2083-4810 |
-| 3019 | 1759-2208 | 1.0 | 1759-2208 |
-| 3020 | 0219-9874 | 1.0 | 0219-9874 |
-| 3021 | 2083-6139 | 1.0 | 2083-6139 |
-| 3022 | 2312-2757 | 1.0 | 2312-2757 |
-
-
-
-
3023 rows × 3 columns
-
-
-
-
-
-```python
-# check which rows still have no ISSN-L after the merge
-qoam.loc[qoam['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-|      | issn      | qoam_av_score | issnl |
-|------|-----------|---------------|-------|
-| 24   | 2163-1182 | 4.50 | NaN |
-| 73   | 2292-1354 | 4.00 | NaN |
-| 77   | 2571-5135 | 4.00 | NaN |
-| 90   | 2201-568X | 4.00 | NaN |
-| 302  | 1687-921X | 3.50 | NaN |
-| 405  | 2391-5412 | 3.25 | NaN |
-| 438  | 2668-0572 | 3.25 | NaN |
-| 801  | 2391-5420 | 3.00 | NaN |
-| 803  | 2391-5447 | 3.00 | NaN |
-| 814  | 2391-5455 | 3.00 | NaN |
-| 815  | 2391-5471 | 3.00 | NaN |
-| 1100 | 2516-3159 | 2.75 | NaN |
-| 1216 | 2289-5639 | 2.50 | NaN |
-| 1228 | 2211-3835 | 2.50 | NaN |
-| 1506 | 1658-3558 | 2.25 | NaN |
-| 1550 | 2214-6296 | 2.25 | NaN |
-| 1960 | 1687-5257 | 2.00 | NaN |
-| 1975 | 1687-5699 | 2.00 | NaN |
-| 2140 | 2056-3315 | 2.00 | NaN |
-| 2150 | 2083-3636 | 2.00 | NaN |
-| 2189 | 2366-0058 | 1.75 | NaN |
-| 2198 | 2450-6966 | 1.75 | NaN |
-| 2254 | 1308-6979 | 1.75 | NaN |
-| 2267 | 1035-7680 | 1.75 | NaN |
-| 2300 | 2411-9660 | 1.75 | NaN |
-| 2611 | 2198-2627 | 1.25 | NaN |
-| 2804 | 2180-2726 | 1.00 | NaN |
-| 2979 | 2146-0574 | 1.00 | NaN |
-
-
-
-
-
-
-
-
-```python
-# fall back to the ISSN on rows where the ISSN-L is missing
-qoam.loc[qoam['issnl'].isna(), 'issnl'] = qoam['issn']
-```
-
-
-```python
-# check that no row is left without an ISSN-L
-qoam.loc[qoam['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- issn
- qoam_av_score
- issnl
-
-
-
-
-
-
-
-
-
-
-```python
-# add the qoam info:
-# qoam_av_score
-qoam_for_merge = qoam[['issnl', 'qoam_av_score']]
-qoam_for_merge
-```
-
-
-
-
-
-
-
-
-
-
-|      | issnl     | qoam_av_score |
-|------|-----------|---------------|
-| 0    | 2254-5883 | 5.0 |
-| 1    | 2279-7254 | 5.0 |
-| 2    | 2317-3076 | 5.0 |
-| 3    | 2525-3468 | 5.0 |
-| 4    | 1339-8474 | 5.0 |
-| ...  | ...       | ... |
-| 3018 | 2083-4810 | 1.0 |
-| 3019 | 1759-2208 | 1.0 |
-| 3020 | 0219-9874 | 1.0 |
-| 3021 | 2083-6139 | 1.0 |
-| 3022 | 2312-2757 | 1.0 |
-
-
-
-
3023 rows × 2 columns
-
-
-
-
-
-```python
-# merge with journals
-journals = pd.merge(journals, qoam_for_merge, on='issnl', how='left')
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- doaj_seal
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- ...
- 1
- Yes
- 1.0
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
- 0.0
- 4.035714
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 995
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 996
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 997
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- ...
- NaN
- NaN
- 0.0
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
- 1.0
- NaN
-
-
- 998
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 999
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
-
-
1000 rows × 21 columns
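The row count rises from 998 to 1000 across this merge, which would be consistent with some `issnl` values appearing more than once on the right-hand side. The sketch below, on made-up frames `journals_toy` and `scores_toy`, shows how a left merge multiplies rows in that situation and how the `validate` argument of `pd.merge` can surface it. Averaging the scores per `issnl` is an assumption made for illustration, not the notebook's approach.

```python
import pandas as pd
from pandas.errors import MergeError

journals_toy = pd.DataFrame({"id": [1, 2], "issnl": ["1932-6203", "0031-9007"]})
# The right-hand table lists the same issnl twice (scores are made up)
scores_toy = pd.DataFrame({
    "issnl": ["1932-6203", "1932-6203"],
    "qoam_av_score": [4.0, 3.0],
})

# A plain left merge repeats the journal row once per matching score row
merged = pd.merge(journals_toy, scores_toy, on="issnl", how="left")
rows_after_merge = len(merged)

# validate="many_to_one" makes pandas raise instead of silently adding rows
try:
    pd.merge(journals_toy, scores_toy, on="issnl", how="left",
             validate="many_to_one")
    duplicate_keys_detected = False
except MergeError:
    duplicate_keys_detected = True

# One possible fix (an assumption): reduce to one score per issnl first
scores_unique = scores_toy.groupby("issnl", as_index=False)["qoam_av_score"].mean()
clean = pd.merge(journals_toy, scores_unique, on="issnl", how="left",
                 validate="many_to_one")
print(rows_after_merge, duplicate_keys_detected, len(clean))
```

Deduplicating the right-hand table before merging keeps one row per journal, so a later `drop_duplicates` on `id` would no longer be needed to undo the expansion.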
-
-
-
-
-
-```python
-# drop duplicate rows (same id)
-journals = journals.drop_duplicates(subset=['id'])
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- doaj_seal
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- ...
- 1
- Yes
- 1.0
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
- 0.0
- 4.035714
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 995
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 996
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 997
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- ...
- NaN
- NaN
- 0.0
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
- 1.0
- NaN
-
-
- 998
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
-
-
- 999
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- ...
- NaN
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
-
-
-
-
996 rows × 21 columns
-
-
-
-
-## Finalizing the journals table
-
-
-```python
-# check for duplicates on issnl
-journals_doublons = journals[['issn', 'issnl', 'title']].loc[journals.duplicated(subset='issnl')].sort_values(by='issnl')
-journals_doublons
-```
-
-
-
-
-
-
-
-
-
-
-|     | issn      | issnl     | title |
-|-----|-----------|-----------|-------|
-| 92  | 1520-5126 | 0002-7863 | Journal of the American Chemical Society (Online) |
-| 393 | 1520-6882 | 0003-2700 | Analytical chemistry (Online) |
-| 310 | 1077-3118 | 0003-6951 | Applied physics letters (Online) |
-| 167 | 1432-0746 | 0004-6361 | Astronomy & astrophysics (Online) |
-| 793 | 1542-0086 | 0006-3495 | Biophysical journal (Online) |
-| ... | ...       | ...       | ... |
-| 426 | 2050-7496 | 2050-7496 | Journal of materials chemistry. A (Online) |
-| 952 | 2050-7534 | 2050-7526 | Journal of materials chemistry. C (Online) |
-| 83  | 2469-9969 | 2469-9950 | Physical review. B. (Online) |
-| 209 | 2470-0029 | 2470-0010 | Physical review. D. (Online) |
-| 840 | 2470-0053 | 2470-0045 | Physical review. E (Online) |
-
-
-
-
85 rows × 3 columns
-
-
-
-
-
-```python
-journals_doublons = journals_doublons.loc[journals_doublons['issnl'].notna()]
-```
-
-
-```python
-# merge to see the rows that have a duplicate
-journals_doublons['doublon_issnl'] = 1
-journals = pd.merge(journals, journals_doublons[['issnl', 'doublon_issnl']], on='issnl', how='left')
-journals.loc[journals['doublon_issnl'] == 1]
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
- doublon_issnl
-
-
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 5
- 6
- 0003-6951
- 0003-6951
- Applied physics letters
- 1962
- 9999
- http://scitation.aip.org/aplo/
- Appl. phys. lett.
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 6
- 7
- 1029-8479
- 1029-8479
- The journal of high energy physics (Online)
- 1997
- 9999
- http://link.springer.com/journal/13130
- J. high energy phys. (Online)
- 124
- 83
- ...
- No
- 1.0
- Journal of High Energy Physics
- 1.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 7
- 8
- 0002-7863
- 0002-7863
- Journal of the American Chemical Society (Print)
- 1879
- 9999
- http://pubs.acs.org/journals/jacsat/index.html
- J. Am. Chem. Soc. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 944
- 950
- 1520-5207
- 1520-5207
- The journal of physical chemistry. B (1997 : O...
- 1997
- 9999
- http://pubs.acs.org/journals/jpcbfk/index.html
- J. phys. chem., B (1997 : Online)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 946
- 952
- 1361-6528
- 0957-4484
- Nanotechnology (Bristol. Online)
- 1990
- 9999
- http://www.iop.org/Journals/na
- Nanotechnology (Bristol, Online)
- 124
- 234
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 947
- 953
- 1469-7645
- 0022-1120
- Journal of fluid mechanics (Online)
- 1956
- 9999
- http://firstsearch.oclc.org
- J. fluid mech. (Online)
- 124
- 234
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- Journal of Fluid Mechanics
- 1.0
- NaN
- 1.0
-
-
- 948
- 954
- 2050-7534
- 2050-7526
- Journal of materials chemistry. C (Online)
- 2013
- 9999
- http://pubs.rsc.org/en/journals/journalissues/tc#
- J. mater. chem. C (Online)
- 124
- 234
- ...
- NaN
- 0.0
- Journal of Materials Chemistry C: Materials fo...
- 1.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 974
- 980
- 1477-0970
- 1352-4585
- Multiple sclerosis (Online)
- 1995
- 9999
- http://www.arnoldpublishers.com/journals/pages...
- Mult. scler. (Online)
- 124
- 234
- ...
- NaN
- 0.0
- Multiple Sclerosis Journal
- 1.0
- preserved
- 1.0
- NaN
- 0.0
- 1.75
- 1.0
-
-
-
-
170 rows × 22 columns
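Marking every row that shares an `issnl` with another row can also be done without merging the duplicate list back in. The sketch below is an alternative written under assumptions: the frame `toy` is made up, the column name `doublon_issnl` is reused from the notebook, and the flag comes out as 1 / 0 rather than 1.0 / NaN.

```python
import pandas as pd

# Toy journals table: the first two rows share an ISSN-L (print and online)
toy = pd.DataFrame({
    "id": [8, 93, 3],
    "issn": ["0002-7863", "1520-5126", "1932-6203"],
    "issnl": ["0002-7863", "0002-7863", "1932-6203"],
})

# keep=False marks every member of a duplicated group, not only the repeats;
# rows without an issnl are excluded, as in the notna() filter of the notebook
has_issnl = toy["issnl"].notna()
toy["doublon_issnl"] = (
    has_issnl & toy.duplicated(subset="issnl", keep=False)
).astype(int)

duplicate_ids = toy.loc[toy["doublon_issnl"] == 1, "id"].tolist()

# Keep the first row of each issnl group
deduplicated = toy.drop_duplicates(subset=["issnl"])
print(duplicate_ids, deduplicated["id"].tolist())
```

One point to keep in mind is that `drop_duplicates` treats missing values as equal to each other, so rows with an empty `issnl` would collapse into a single row unless they are handled separately.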
-
-
-
-
-
-```python
-journals.loc[journals['doublon_issnl'] == 1].sort_values(by='issnl')
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
- doublon_issnl
-
-
-
-
- 7
- 8
- 0002-7863
- 0002-7863
- Journal of the American Chemical Society (Print)
- 1879
- 9999
- http://pubs.acs.org/journals/jacsat/index.html
- J. Am. Chem. Soc. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 92
- 93
- 1520-5126
- 0002-7863
- Journal of the American Chemical Society (Online)
- 1879
- 9999
- http://books.google.com/books?id=ExsEZbIZKjwC
- J. Am. Chem. Soc. (Online)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 393
- 396
- 1520-6882
- 0003-2700
- Analytical chemistry (Online)
- 1947
- 9999
- http://pubs.acs.org/journals/ancham/about.html
- Anal. chem. (Online)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 69
- 70
- 0003-2700
- 0003-2700
- Analytical chemistry (Washington)
- 1948
- 9999
- http://pubs.acs.org/journals/ancham/index.html
- Anal. chem. (Wash.)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 5
- 6
- 0003-6951
- 0003-6951
- Applied physics letters
- 1962
- 9999
- http://scitation.aip.org/aplo/
- Appl. phys. lett.
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40
- 41
- 2469-9950
- 2469-9950
- Physical review. B
- 2016
- 9999
- http://journals.aps.org/prb
- Phys. rev. B.
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 79
- 80
- 2470-0010
- 2470-0010
- Physical review. D
- 2016
- 9999
- http://journals.aps.org/prd
- Phys. rev. D.
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 209
- 210
- 2470-0029
- 2470-0010
- Physical review. D. (Online)
- 2016
- 9999
- http://journals.aps.org/prd
- Phys. rev. D. (Online)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 530
- 533
- 2470-0045
- 2470-0045
- Physical review. E (Print)
- 2016
- 9999
- http://journals.aps.org/pre
- Phys. rev., E (Print)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 836
- 842
- 2470-0053
- 2470-0045
- Physical review. E (Online)
- 2016
- 9999
- http://journals.aps.org/pre
- Phys. rev., E (Online)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
-
-
170 rows × 22 columns
-
-
-
-
-
-```python
-# export the duplicates as tab-separated values
-journals.loc[journals['doublon_issnl'] == 1].sort_values(by='issnl').to_csv('sample/journals_duplicates.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export the duplicates as Excel
-journals.loc[journals['doublon_issnl'] == 1].sort_values(by='issnl').to_excel('sample/journals_duplicates.xlsx', index=False)
-```
-
-
-```python
-# drop duplicate rows (same issnl), keeping the first occurrence
-journals = journals.drop_duplicates(subset=['issnl'])
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- APC
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
- doublon_issnl
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- ...
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- ...
- Yes
- 1.0
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
- 0.0
- 4.035714
- NaN
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- ...
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 991
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
-
-
- 992
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
-
-
- 993
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- ...
- NaN
- 0.0
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
- 1.0
- NaN
- NaN
-
-
- 994
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- ...
- NaN
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
-
-
- 995
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
-
-
-
-
911 rows × 22 columns
-
-
-
-
-
-```python
-# add oa_status
-# 6 : Diamond
-# 5 : Gold
-# 4 : Full
-# 3 : Hybrid
-# 2 : Green
-# 1 : UNKNOWN
-journals['oa_status'] = 1
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- doaj_status
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
- doublon_issnl
- oa_status
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- ...
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
- 1
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- ...
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
- 1
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- ...
- 1.0
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
- 0.0
- 4.035714
- NaN
- 1
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- ...
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
- 1
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- ...
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
- 1
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 991
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- ...
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
- 1
-
-
- 992
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- ...
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
- 1
-
-
- 993
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- ...
- 0.0
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
- 1.0
- NaN
- NaN
- 1
-
-
- 994
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- ...
- 0.0
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
- 1
-
-
- 995
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- ...
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
- 1
-
-
-
-
911 rows × 23 columns
-
-
-
-
-
-```python
-# status 5 (Gold) for journals listed in DOAJ
-journals.loc[journals['doaj_status'] == 1, 'oa_status'] = 5
-# status 6 (Diamond) for DOAJ journals that charge no APC
-journals.loc[(journals['doaj_status'] == 1) & (journals['APC'] == 'No'), 'oa_status'] = 6
-journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | ... | doaj_status | lockss_title | lockss | portico_status | portico | nlch_title | nlch | qoam_av_score | doublon_issnl | oa_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | ... | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | ... | 1.0 | PLoS One | 1.0 | NaN | 0.0 | NaN | 0.0 | 4.035714 | NaN | 5 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | ... | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 991 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 992 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 993 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | ... | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 |
-| 994 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 995 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | ... | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 |
-
-911 rows × 23 columns
-
-
-
-
-
-```python
-journals['oa_status'].value_counts()
-```
-
-
-
-
- 1 824
- 5 70
- 6 17
- Name: oa_status, dtype: int64
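
The status assignment above relies on the order of two `.loc` writes: Gold is set first and Diamond overwrites it. The same rule can be written as a single expression. The following is a minimal sketch on invented toy data, not the notebook's code; only the column names `doaj_status`, `APC` and `oa_status` come from the notebook.

```python
import numpy as np
import pandas as pd

# Toy data: column names follow the notebook, values are invented for illustration.
toy = pd.DataFrame({
    'title': ['Journal A', 'Journal B', 'Journal C'],
    'doaj_status': [0, 1, 1],
    'APC': [np.nan, 'Yes', 'No'],
})

in_doaj = toy['doaj_status'] == 1
no_apc = toy['APC'] == 'No'

# np.select returns the choice of the first condition that matches,
# so the narrower Diamond rule (6) must be listed before Gold (5).
toy['oa_status'] = np.select(
    [in_doaj & no_apc, in_doaj],
    [6, 5],
    default=1,  # 1 = UNKNOWN
)
print(toy[['title', 'oa_status']])
```

Listing the conditions from most to least specific makes the precedence explicit instead of leaving it implied by statement order.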
-
-
-
-
-```python
-# export the raw table as TSV
-journals.to_csv('sample/journals_brut.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export the raw table as Excel
-journals.to_excel('sample/journals_brut.xlsx', index=False)
-```
-
-
-```python
-# export the ids as TSV
-journals[['id', 'title', 'issn', 'issnl']].to_csv('sample/journals_ids.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export the ids as Excel
-journals[['id', 'title', 'issn', 'issnl']].to_excel('sample/journals_ids.xlsx', index=False)
-```
-
-
-```python
-
-```
diff --git a/import_scripts/03_oacct_journals.py b/import_scripts/03_oacct_journals.py
deleted file mode 100644
index 3b63a7b5..00000000
--- a/import_scripts/03_oacct_journals.py
+++ /dev/null
@@ -1,1062 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts selected data from the sources obtained via API and processes them so that they can be used in the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# ## Journal data extraction
-#
-#
-# ## Initial corpus
-#
-# ISSNs of the journals of the publications archived in the UNIGE AoU and in EPFL Infoscience
-#
-# * AoU ISSN file exported on 16.10.2020
-# * Infoscience ISSN file exported on 28.01.2021
-# * Data extracted from the ISSN.org JSON
-#
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# number of journals to keep in the sample (0 to keep all of them)
-journals_sample_n = 1000
-
-
-# ## Table OA categories
-#
-# * 1 : UNKNOWN
-# * 2 : Green
-# * 3 : Hybrid
-# * 4 : Full
-# * 5 : Gold
-# * 6 : Diamond
-
-# In[2]:
-
-
-# create the DataFrame
-col_names = ['id',
- 'status',
- 'description',
- 'subscription',
- 'accepted_manuscript',
- 'apc',
- 'final_version'
- ]
-oas = pd.DataFrame(columns = col_names)
-oas
-
-
-# In[3]:
-
-
-# add the values
-oas = oas.append({'id' : 1, 'status' : 'UNKNOWN', 'description' : '', 'subscription' : 0, 'accepted_manuscript' : 0, 'apc' : 0, 'final_version' : 0}, ignore_index=True)
-oas = oas.append({'id' : 2, 'status' : 'Green', 'description' : 'Paywalled access journal, usually allows the archive of submitted or accepted version on institutional repositories (embargo periods may apply)', 'subscription' : 1, 'accepted_manuscript' : 1, 'apc' : 0, 'final_version' : 0}, ignore_index=True)
-oas = oas.append({'id' : 3, 'status' : 'hybrid', 'description' : 'Paywalled access journal, offers several Open Access upon payment of APCs. It often allows the archive of published version on institutional repositories (embargo periods can apply)', 'subscription' : 1, 'accepted_manuscript' : 1, 'apc' : 1, 'final_version' : 1}, ignore_index=True)
-# oas = oas.append({'id' : 4, 'status' : 'Full', 'description' : 'No subscription, Green or Gold', 'subscription' : 0, 'accepted_manuscript' : 1, 'apc' : 0, 'final_version' : 1}, ignore_index=True)
-oas = oas.append({'id' : 5, 'status' : 'Gold', 'description' : 'Open Access journal (payment of APCs may apply). It often allows the archive of published version on institutional repositories (embargo periods can apply)', 'subscription' : 0, 'accepted_manuscript' : 1, 'apc' : 1, 'final_version' : 1}, ignore_index=True)
-oas = oas.append({'id' : 6, 'status' : 'Diamond', 'description' : 'Open Access journal (without payment of APCs). It often allows the archive of published version on institutional repositories (embargo periods can apply)', 'subscription' : 0, 'accepted_manuscript' : 1, 'apc' : 0, 'final_version' : 1}, ignore_index=True)
-
-
-# In[4]:
-
-
-oas
-
-
-# In[5]:
-
-
-# export JSON
-result = oas.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/oa.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[6]:
-
-
-# export csv
-oas.to_csv('sample/oa.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[7]:
-
-
-# export excel
-oas.to_excel('sample/oa.xlsx', index=False)
-
-
-# ## Table Journals
-
-# In[8]:
-
-
-issns = pd.read_csv('issn/issns_count.tsv', encoding='utf-8', header=0, sep='\t')
-issns
-
-
-# In[9]:
-
-
-# add the id column
-issns.insert(0, 'id', '', False)
-issns
-
-
-# In[10]:
-
-
-# turn the index into a column
-issns = issns.reset_index()
-issns
-
-
-# In[11]:
-
-
-# set the id to index + 1
-issns['id'] = issns['index'] + 1
-del issns['index']
-issns
-
-
-# In[12]:
-
-
-# reduce to X journals for the test sample
-# note: .loc slicing is label-based and inclusive, so this keeps journals_sample_n + 1 rows
-if journals_sample_n > 0 :
-    issns = issns.loc[:journals_sample_n]
-issns
-
-
-# In[13]:
-
-
-# load the ISSN to ISSN-L mapping
-df_issnl = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-df_issnl
-
-
-# In[14]:
-
-
-# rename the columns
-df_issnl = df_issnl.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-
-
-# In[15]:
-
-
-issns = pd.merge(issns, df_issnl, on='issn', how='left')
-issns
-
-
-# In[16]:
-
-
-# create the DataFrame
-# 'oa_status' left out for now
-col_names = ['id',
- 'issn',
- 'issnl',
- 'title',
- 'starting_year',
- 'end_year',
- 'url',
- 'name_short_iso_4'
- ]
-journals = pd.DataFrame(columns = col_names)
-journals
-
-
-# In[17]:
-
-
-# create the DataFrame
-col_names = ['id', 'iso_code']
-journals_languages = pd.DataFrame(columns = col_names)
-journals_languages
-
-
-# In[18]:
-
-
-# create the DataFrame
-# 'oa_status' removed
-col_names = ['id', 'iso_code']
-journals_countries = pd.DataFrame(columns = col_names)
-journals_countries
-
-
-# In[19]:
-
-
-# extract the information from the ISSN.org data
-# note: DataFrame.append was removed in pandas 2.0, so this loop needs an older pandas
-for index, row in issns.iterrows():
-    myid = row['id']
-    myissn = row['issn']
-    if (((index/10) - int(index/10)) == 0) :
-        print(index)
-    # initialise the variables to extract
-    issnl = np.nan
-    title = ''
-    keytitle = ''
-    starting_year = np.nan
-    end_year = np.nan
-    myurl = np.nan
-    journal_country = np.nan
-    journal_language = np.nan
-    keytitle_abbr = np.nan
-    # read the JSON file
-    if os.path.exists('issn/data/' + myissn + '.json'):
-        with open('issn/data/' + myissn + '.json', 'r', encoding='utf-8') as f:
-            data = json.load(f)
-        for x in data['@graph']:
-            if ('@id' in x):
-                if (x['@id'] == 'resource/ISSN/' + myissn):
-                    if ('mainTitle' in x):
-                        title = x['mainTitle']
-                    else :
-                        if ('name' in x):
-                            title = x['name']
-                            # print(myissn)
-                    if ('startDate' in x):
-                        starting_year = x['startDate']
-                    if ('endDate' in x):
-                        end_year = x['endDate']
-                    if ('url' in x):
-                        urls = x['url']
-                        if type(urls) is list:
-                            for url in urls:
-                                # Skip the archive URLs:
-                                # www.ncbi.nlm.nih.gov/pmc/*
-                                # www.pubmedcentral.gov/*
-                                # pubmedcentral.nih.gov/*
-                                # bibpurl.oclc.org/*
-                                # www.jstor.org/*
-                                # ieeexplore.ieee.org
-                                # ovidsp.ovid.com
-                                # and keep the first of the remaining ones
-                                myurl = url
-                                if ('ncbi.nlm.nih.gov' not in url
-                                    and 'pubmedcentral' not in url
-                                    and 'bibpurl.oclc.org' not in url
-                                    and 'jstor.org' not in url
-                                    and 'ieeexplore.ieee.org' not in url
-                                    and 'ovidsp.ovid.com' not in url):
-                                    break
-                        else :
-                            myurl = x['url']
-                    if ('spatial' in x):
-                        countries = x['spatial']
-                        if type(countries) is list:
-                            for country in countries:
-                                if ('https://www.iso.org/obp/ui/#iso:code:3166:' in country):
-                                    journal_country = country[-2:]
-                                    journals_countries = journals_countries.append({'id' : myid, 'iso_code' : journal_country}, ignore_index=True)
-                        else :
-                            if ('https://www.iso.org/obp/ui/#iso:code:3166:' in countries):
-                                journal_country = countries[-2:]
-                                journals_countries = journals_countries.append({'id' : myid, 'iso_code' : journal_country}, ignore_index=True)
-                    # language "inLanguage": "http://id.loc.gov/vocabulary/iso639-2/eng",
-                    if ('inLanguage' in x):
-                        languages = x['inLanguage']
-                        if type(languages) is list:
-                            for language in languages:
-                                journal_language = language[-3:]
-                                journals_languages = journals_languages.append({'id' : myid, 'iso_code' : journal_language}, ignore_index=True)
-                        else :
-                            journal_language = languages[-3:]
-                            journals_languages = journals_languages.append({'id' : myid, 'iso_code' : journal_language}, ignore_index=True)
-                if (x['@id'] == 'resource/ISSN/' + myissn + '#KeyTitle'):
-                    if ('value' in x):
-                        keytitle = x['value']
-                if (x['@id'] == 'resource/ISSN/' + myissn + '#ISSN-L'):
-                    if ('value' in x):
-                        issnl = x['value']
-                # "@id": "resource/ISSN/1098-0121#AbbreviatedKeyTitle",
-                if (x['@id'] == 'resource/ISSN/' + myissn + '#AbbreviatedKeyTitle'):
-                    if ('value' in x):
-                        mykeytitle_abbrs = x['value']
-                        if type(mykeytitle_abbrs) is list:
-                            for mykeytitle_abbr in mykeytitle_abbrs:
-                                print(myissn + ' - AbbreviatedKeyTitle is a list ' + mykeytitle_abbr)
-                                keytitle_abbr = mykeytitle_abbr
-                                with open('sample/03_journals_issn_multiple_titles.txt', 'a', encoding='utf-8') as g:
-                                    g.write(myissn + ' AbbreviatedKeyTitle is a list ' + mykeytitle_abbr + '\n')
-                                break
-                        else :
-                            keytitle_abbr = mykeytitle_abbrs
-        if keytitle != '' :
-            title = keytitle
-        if title != '' :
-            # remove the trailing period
-            if (title[-1] == '.'):
-                title = title[0:-1]
-            # replace the special characters The
-            if type(title) is list:
-                for mytitlei in title:
-                    print(myissn + ' - title is a list ' + mytitlei)
-                    title = str.replace(mytitlei, 'The ', 'The ')
-                    with open('sample/03_journals_issn_multiple_titles.txt', 'a', encoding='utf-8') as g:
-                        g.write(myissn + ' title is a list ' + mytitlei + '\n')
-                    break
-            else :
-                title = str.replace(title, 'The ', 'The ')
-    else :
-        print(row['issn'] + ' - not found')
-        with open('sample/03_journals_issn_errors.txt', 'a', encoding='utf-8') as g:
-            g.write(row['issn'] + ' not found \n')
-    journals.at[index,'id'] = myid
-    journals.at[index,'title'] = title
-    journals.at[index,'issn'] = myissn
-    journals.at[index,'issnl'] = issnl
-    journals.at[index,'starting_year'] = starting_year
-    journals.at[index,'end_year'] = end_year
-    journals.at[index,'url'] = myurl
-    journals.at[index,'name_short_iso_4'] = keytitle_abbr
-
-
-# In[20]:
-
-
-journals
-
-
-# In[21]:
-
-
-# journals with an empty title
-journals.loc[journals['title'] == '']
-
-
-# In[22]:
-
-
-# export the journals without a title as TSV
-journals.loc[journals['title'] == ''].to_csv('sample/journals_sans_titre.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[23]:
-
-
-# export the journals without a title as Excel
-journals.loc[journals['title'] == ''].to_excel('sample/journals_sans_titre.xlsx', index=False)
-
-
-# In[24]:
-
-
-# keep only the rows that have a title
-journals = journals.loc[journals['title'] != '']
-journals
-
-
-# In[25]:
-
-
-journals.shape[0]
-
-
-# ## Languages
-
-# In[26]:
-
-
-journals_languages
-
-
-# In[27]:
-
-
-# load the languages table
-languages = pd.read_csv('sample/language.tsv', encoding='utf-8', header=0, sep='\t')
-languages
-
-
-# In[28]:
-
-
-# drop the name and rename the id column
-del languages['name']
-languages = languages.rename(columns={'id' : 'language'})
-
-
-# In[29]:
-
-
-# merge with languages
-journals_languages = pd.merge(journals_languages, languages, on='iso_code', how='left')
-journals_languages
-
-
-# In[30]:
-
-
-# concatenate the values that share the same id
-journals_languages['language'] = journals_languages['language'].astype(str)
-journals_languages = journals_languages.groupby('id').agg({'language': lambda x: ', '.join(x)})
-journals_languages
-
-
-# In[31]:
-
-
-# attach the language ids to the journals
-journals = pd.merge(journals, journals_languages, on='id', how='left')
-journals
-
-
-# ## Countries
-
-# In[32]:
-
-
-journals_countries
-
-
-# In[33]:
-
-
-# load the countries table
-country = pd.read_csv('sample/country.tsv', encoding='utf-8', header=0, sep='\t')
-country
-
-
-# In[34]:
-
-
-# drop the name and rename the id column
-del country['name']
-country = country.rename(columns={'id' : 'country'})
-
-
-# In[35]:
-
-
-# merge with countries
-journals_countries = pd.merge(journals_countries, country, on='iso_code', how='left')
-journals_countries
-
-
-# In[36]:
-
-
-# concatenate the values that share the same id
-journals_countries['country'] = journals_countries['country'].astype(str)
-journals_countries = journals_countries.groupby('id').agg({'country': lambda x: ', '.join(x)})
-journals_countries
-
-
-# In[37]:
-
-
-# attach the country ids to the journals
-journals = pd.merge(journals, journals_countries, on='id', how='left')
-journals
-
-
-# ### DOAJ
-
-# In[38]:
-
-
-# load the DOAJ data
-doaj = pd.read_csv('doaj/journalcsv__doaj_20210312_0636_utf8.csv', encoding='utf-8', header=0)
-doaj
-
-
-# In[39]:
-
-
-# pick one ISSN per row (print, else online) to look up the ISSN-L
-doaj['issn'] = doaj['Journal ISSN (print version)']
-doaj.loc[doaj['issn'].isna(), 'issn'] = doaj['Journal EISSN (online version)']
-doaj
-
-
-# In[40]:
-
-
-doaj = pd.merge(doaj, df_issnl, on='issn', how='left')
-doaj
-
-
-# In[41]:
-
-
-doaj.columns
-
-
-# In[42]:
-
-
-doaj['Preservation Services']
-
-
-# In[43]:
-
-
-doaj['DOAJ Seal']
-
-
-# In[44]:
-
-
-doaj['issnl']
-
-
-# In[45]:
-
-
-doaj['APC'].value_counts()
-
-
-# In[46]:
-
-
-# columns to take from DOAJ:
-# Journal title
-# DOAJ Seal
-# APC
-doaj_for_merge = doaj[['issnl', 'Journal title', 'DOAJ Seal', 'APC']]
-doaj_for_merge
-
-
-# In[47]:
-
-
-# rename the columns
-doaj_for_merge = doaj_for_merge.rename(columns={'Journal title' : 'doaj_title', 'DOAJ Seal' : 'doaj_seal'})
-doaj_for_merge
-
-
-# In[48]:
-
-
-# merge with journals
-journals = pd.merge(journals, doaj_for_merge, on='issnl', how='left')
-journals
-
-
-# In[49]:
-
-
-# flag whether the journal is in DOAJ and whether it has the DOAJ Seal
-journals.loc[journals['doaj_title'].isna(), 'doaj_status'] = 0
-journals.loc[~journals['doaj_title'].isna(), 'doaj_status'] = 1
-journals.loc[journals['doaj_seal'] == 'Yes', 'doaj_seal'] = 1
-journals.loc[journals['doaj_seal'] == 'No', 'doaj_seal'] = 0
-journals
-
-
-# ### LOCKSS
-
-# In[50]:
-
-
-# load the LOCKSS preservation data
-lockss = pd.read_csv('lockss/keepers-LOCKSS-report.csv', encoding='utf-8', header=0, skiprows=1)
-lockss
-
-
-# In[51]:
-
-
-# pick one ISSN per row (online, else print) to look up the ISSN-L
-lockss['issn'] = lockss['eISSN']
-lockss.loc[lockss['eISSN'].isna(), 'issn'] = lockss['ISSN']
-lockss
-
-
-# In[52]:
-
-
-lockss = pd.merge(lockss, df_issnl, on='issn', how='left')
-lockss
-
-
-# In[53]:
-
-
-lockss.columns
-
-
-# In[54]:
-
-
-# check the rows left without an ISSN-L after the merge
-lockss.loc[lockss['issnl'].isna()]
-
-
-# In[55]:
-
-
-# fall back to the ISSN on those rows
-lockss.loc[lockss['issnl'].isna(), 'issnl'] = lockss['issn']
-
-
-# In[56]:
-
-
-# check the rows still without an ISSN-L
-lockss.loc[lockss['issnl'].isna()]
-
-
-# In[57]:
-
-
-# columns to take from LOCKSS:
-# Title
-lockss_for_merge = lockss[['issnl', 'Title']]
-lockss_for_merge
-
-
-# In[58]:
-
-
-# rename the columns
-lockss_for_merge = lockss_for_merge.rename(columns={'Title' : 'lockss_title'})
-lockss_for_merge
-
-
-# In[59]:
-
-
-# merge with journals
-journals = pd.merge(journals, lockss_for_merge, on='issnl', how='left')
-journals
-
-
-# In[60]:
-
-
-# drop the duplicates (one row per journal id)
-journals = journals.drop_duplicates(subset=['id'])
-journals
-
-
-# In[61]:
-
-
-# flag whether the journal is present in LOCKSS
-journals.loc[journals['lockss_title'].isna(), 'lockss'] = 0
-journals.loc[~journals['lockss_title'].isna(), 'lockss'] = 1
-journals
-
-
-# ### Portico
-
-# In[62]:
-
-
-# load the Portico preservation data
-portico = pd.read_excel('portico/e-journals.xlsx', sheet_name='Details', skiprows=2)
-portico
-
-
-# In[63]:
-
-
-# pick one ISSN per row (online, else print) to look up the ISSN-L
-portico['issn'] = portico['e-ISSN']
-portico.loc[portico['e-ISSN'].isna(), 'issn'] = portico['Print ISSN']
-portico
-
-
-# In[64]:
-
-
-portico = pd.merge(portico, df_issnl, on='issn', how='left')
-portico
-
-
-# In[65]:
-
-
-portico.columns
-
-
-# In[66]:
-
-
-# check the rows left without an ISSN-L after the merge
-portico.loc[portico['issnl'].isna()]
-
-
-# In[67]:
-
-
-# fall back to the ISSN on those rows
-portico.loc[portico['issnl'].isna(), 'issnl'] = portico['issn']
-
-
-# In[68]:
-
-
-# check the rows still without an ISSN-L
-portico.loc[portico['issnl'].isna()]
-
-
-# In[69]:
-
-
-# columns to take from Portico:
-# Status
-portico_for_merge = portico[['issnl', 'Status']]
-portico_for_merge
-
-
-# In[70]:
-
-
-# keep only the "preserved" rows
-portico_for_merge = portico_for_merge.loc[portico_for_merge['Status'] == 'preserved']
-portico_for_merge
-
-
-# In[71]:
-
-
-# rename the columns
-portico_for_merge = portico_for_merge.rename(columns={'Status' : 'portico_status'})
-portico_for_merge
-
-
-# In[72]:
-
-
-# merge with journals
-journals = pd.merge(journals, portico_for_merge, on='issnl', how='left')
-journals
-
-
-# In[73]:
-
-
-# drop the duplicates (one row per journal id)
-journals = journals.drop_duplicates(subset=['id'])
-journals
-
-
-# In[74]:
-
-
-# flag whether the journal is preserved in Portico
-journals.loc[journals['portico_status'].isna(), 'portico'] = 0
-journals.loc[~journals['portico_status'].isna(), 'portico'] = 1
-journals
-
-
-# ### Licences Nationales
-
-# In[75]:
-
-
-# load the Swiss National Licences data (Cambridge file)
-nlch1 = pd.read_excel('licences_nationales/cambridge_Switzerland_NationalLicences_2020-08-17.xlsx')
-nlch1
-
-
-# In[76]:
-
-
-# load the Swiss National Licences data (gruyter file)
-nlch2 = pd.read_excel('licences_nationales/gruyter_Switzerland_NationalLicences_2020-11-30.xlsx')
-nlch2
-
-
-# In[77]:
-
-
-# load the Swiss National Licences data (Oxford file)
-nlch3 = pd.read_excel('licences_nationales/oxford_Switzerland_NationalLicences_2020-09-24.xlsx')
-nlch3
-
-
-# In[78]:
-
-
-# load the Swiss National Licences data (Springer file)
-nlch4 = pd.read_excel('licences_nationales/springer_Switzerland_NationalLicences_2020-08-12.xlsx')
-nlch4
-
-
-# In[79]:
-
-
-# concatenate the four files
-nlch = pd.concat([nlch1, nlch2, nlch3, nlch4], ignore_index=True)
-nlch
-
-
-# In[80]:
-
-
-nlch.columns
-
-
-# In[81]:
-
-
-# pick one ISSN per row (online, else print) to look up the ISSN-L
-nlch['issn'] = nlch['online_identifier']
-nlch.loc[nlch['online_identifier'].isna(), 'issn'] = nlch['print_identifier']
-nlch
-
-
-# In[82]:
-
-
-nlch = pd.merge(nlch, df_issnl, on='issn', how='left')
-nlch
-
-
-# In[83]:
-
-
-# check the rows left without an ISSN-L after the merge
-nlch.loc[nlch['issnl'].isna()]
-
-
-# In[84]:
-
-
-# fall back to the ISSN on those rows
-nlch.loc[nlch['issnl'].isna(), 'issnl'] = nlch['issn']
-
-
-# In[85]:
-
-
-# check the rows still without an ISSN-L
-nlch.loc[nlch['issnl'].isna()]
-
-
-# In[86]:
-
-
-# columns to take from the National Licences data:
-# publication_title
-nlch_for_merge = nlch[['issnl', 'publication_title']]
-nlch_for_merge
-
-
-# In[87]:
-
-
-# rename the columns
-nlch_for_merge = nlch_for_merge.rename(columns={'publication_title' : 'nlch_title'})
-nlch_for_merge
-
-
-# In[88]:
-
-
-# merge with journals
-journals = pd.merge(journals, nlch_for_merge, on='issnl', how='left')
-journals
-
-
-# In[89]:
-
-
-# flag whether the journal is covered by the Swiss National Licences
-journals.loc[journals['nlch_title'].isna(), 'nlch'] = 0
-journals.loc[~journals['nlch_title'].isna(), 'nlch'] = 1
-journals
-
-
-# ### QOAM
-
-# In[90]:
-
-
-# load the QOAM file
-qoam = pd.read_csv('qoam/qoam_not_zero.tsv', encoding='utf-8', header=0, sep='\t')
-qoam
-
-
-# In[91]:
-
-
-qoam = pd.merge(qoam, df_issnl, on='issn', how='left')
-qoam
-
-
-# In[92]:
-
-
-# check the rows left without an ISSN-L after the merge
-qoam.loc[qoam['issnl'].isna()]
-
-
-# In[93]:
-
-
-# fall back to the ISSN on those rows
-qoam.loc[qoam['issnl'].isna(), 'issnl'] = qoam['issn']
-
-
-# In[94]:
-
-
-# check the rows still without an ISSN-L
-qoam.loc[qoam['issnl'].isna()]
-
-
-# In[95]:
-
-
-# columns to take from QOAM:
-# qoam_av_score
-qoam_for_merge = qoam[['issnl', 'qoam_av_score']]
-qoam_for_merge
-
-
-# In[96]:
-
-
-# merge with journals
-journals = pd.merge(journals, qoam_for_merge, on='issnl', how='left')
-journals
-
-
-# In[97]:
-
-
-# drop the duplicates (one row per journal id)
-journals = journals.drop_duplicates(subset=['id'])
-journals
-
-
-# ## Finalisation de la table journals
-
-# In[98]:
-
-
-# check for duplicated ISSN-Ls
-journals_doublons = journals[['issn', 'issnl', 'title']].loc[journals.duplicated(subset='issnl')].sort_values(by='issnl')
-journals_doublons
-
-
-# In[99]:
-
-
-journals_doublons = journals_doublons.loc[journals_doublons['issnl'].notna()]
-
-
-# In[100]:
-
-
-# merge to flag the rows whose ISSN-L is duplicated
-journals_doublons['doublon_issnl'] = 1
-journals = pd.merge(journals, journals_doublons[['issnl', 'doublon_issnl']], on='issnl', how='left')
-journals.loc[journals['doublon_issnl'] == 1]
-
-
-# In[101]:
-
-
-journals.loc[journals['doublon_issnl'] == 1].sort_values(by='issnl')
-
-
-# In[102]:
-
-
-# export the duplicates as TSV
-journals.loc[journals['doublon_issnl'] == 1].sort_values(by='issnl').to_csv('sample/journals_duplicates.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[103]:
-
-
-# export the duplicates as Excel
-journals.loc[journals['doublon_issnl'] == 1].sort_values(by='issnl').to_excel('sample/journals_duplicates.xlsx', index=False)
-
-
-# In[104]:
-
-
-# drop the duplicates (one row per ISSN-L)
-journals = journals.drop_duplicates(subset=['issnl'])
-journals
-
-
-# In[105]:
-
-
-# add the oa_status (default: UNKNOWN)
-# 6 : Diamond
-# 5 : Gold
-# 4 : Full
-# 3 : Hybrid
-# 2 : Green
-# 1 : UNKNOWN
-journals['oa_status'] = 1
-journals
-
-
-# In[106]:
-
-
-# status 5 (Gold) for journals listed in DOAJ
-journals.loc[journals['doaj_status'] == 1, 'oa_status'] = 5
-# status 6 (Diamond) for DOAJ journals that charge no APC
-journals.loc[(journals['doaj_status'] == 1) & (journals['APC'] == 'No'), 'oa_status'] = 6
-journals
-
-
-# In[107]:
-
-
-journals['oa_status'].value_counts()
-
-
-# In[108]:
-
-
-# export the raw table as TSV
-journals.to_csv('sample/journals_brut.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[109]:
-
-
-# export the raw table as Excel
-journals.to_excel('sample/journals_brut.xlsx', index=False)
-
-
-# In[110]:
-
-
-# export the ids as TSV
-journals[['id', 'title', 'issn', 'issnl']].to_csv('sample/journals_ids.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[111]:
-
-
-# export the ids as Excel
-journals[['id', 'title', 'issn', 'issnl']].to_excel('sample/journals_ids.xlsx', index=False)
-
-
-# In[ ]:
-
-
-
-
diff --git a/import_scripts/04_oacct_publishers.md b/import_scripts/04_oacct_publishers.md
deleted file mode 100644
index c855a57c..00000000
--- a/import_scripts/04_oacct_publishers.md
+++ /dev/null
@@ -1,2826 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) project
-
-P5 project of the EPFL library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts selected data from the sources obtained via API and processes them so that they can be used in the OACCT application.
-
-Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-## Publisher data extraction
-
-Sources:
-1. ISSN.org data (JSON)
-
-### Source data format
-
-* Node: "@graph"
-* spatial & publisher:
-    * "@id": "resource/ISSN/0140-6736",
-    * "spatial": [
-        "http://id.loc.gov/vocabulary/countries/ne",
-        "https://www.iso.org/obp/ui/#iso:code:3166:NL"
-      ],
-
-Example with several publishers over time:
-
-    "publisher": [
-        "resource/ISSN/0140-6736#Publisher-Elsevier",
-        "resource/ISSN/0140-6736#Publisher-J._Onwhyn"
-    ],
-
-    {
-        "@id": "resource/ISSN/0140-6736#LatestPublicationEvent",
-        "@type": "http://schema.org/PublicationEvent",
-        "publishedBy": "resource/ISSN/0140-6736#Publisher-Elsevier",
-        "location": "resource/ISSN/0140-6736#PublicationPlace-Amsterdam"
-    },
-
-    {
-        "@id": "resource/ISSN/0140-6736#Publisher-Elsevier",
-        "@type": "http://schema.org/Organization",
-        "name": "Elsevier"
-    },
-
-Example with a single publisher over time:
-
-    "publisher": "resource/ISSN/0899-8418#Publisher-Wiley",
-
-    {
-        "@id": "resource/ISSN/0899-8418#EarliestPublicationEvent",
-        "@type": "http://schema.org/PublicationEvent",
-        "publishedBy": "resource/ISSN/0899-8418#Publisher-Wiley",
-        "temporal": "c1989-",
-        "location": [
-            "resource/ISSN/0899-8418#PublicationPlace-New_York",
-            "resource/ISSN/0899-8418#PublicationPlace-Chichester"
-        ]
-    },
-
-    {
-        "@id": "resource/ISSN/0899-8418#Publisher-Wiley",
-        "@type": "http://schema.org/Organization",
-        "name": "Wiley"
-    },
-
-Example with a list of publishers in the latest publication event:
-
-    {
-        "@id": "resource/ISSN/2174-8454#LatestPublicationEvent",
-        "@type": "http://schema.org/PublicationEvent",
-        "publishedBy": [
-            "resource/ISSN/2174-8454#Publisher-The_Global_Studies_Institute_de_l’Université_de_Genève",
-            "resource/ISSN/2174-8454#Publisher-Universitat_de_València,_Departamento_de_Teoría_de_los_Lenguajes_y_Ciencias_de_la_Comunicación"
-        ],
-        "location": "resource/ISSN/2174-8454#PublicationPlace-Valencia"
-    },
-
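
The three shapes above (a single `publishedBy` string, a list of them, and an earliest event when no latest one exists) can be handled by one small lookup. The following is a minimal sketch, not the notebook's code: the helper names `as_list` and `publisher_names` are hypothetical, and the sample graph reuses only the Lancet and Wiley nodes quoted above.

```python
def as_list(value):
    """Normalise a JSON-LD value that may be missing, a string or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def publisher_names(graph, issn):
    """Return the publisher names of the latest publication event,
    falling back to the earliest event when no latest publisher exists."""
    prefix = 'resource/ISSN/' + issn
    # Index the graph once by '@id' instead of rescanning it for every lookup.
    nodes = {node['@id']: node for node in graph if '@id' in node}
    for event in ('#LatestPublicationEvent', '#EarliestPublicationEvent'):
        ids = as_list(nodes.get(prefix + event, {}).get('publishedBy'))
        if ids:
            return [nodes.get(pid, {}).get('name', '') for pid in ids]
    return []


# Sample nodes copied from the format description above.
graph = [
    {'@id': 'resource/ISSN/0140-6736#LatestPublicationEvent',
     'publishedBy': 'resource/ISSN/0140-6736#Publisher-Elsevier'},
    {'@id': 'resource/ISSN/0140-6736#Publisher-Elsevier', 'name': 'Elsevier'},
    {'@id': 'resource/ISSN/0899-8418#EarliestPublicationEvent',
     'publishedBy': 'resource/ISSN/0899-8418#Publisher-Wiley'},
    {'@id': 'resource/ISSN/0899-8418#Publisher-Wiley', 'name': 'Wiley'},
]

print(publisher_names(graph, '0140-6736'))
print(publisher_names(graph, '0899-8418'))
```

Building the `nodes` dictionary first avoids a second pass over `@graph` for every publisher id, which is the part of the lookup that grows with the size of the record.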
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-```
-
-## Table Publishers
-
-
-```python
-# create the DataFrame
-# 'country' removed here so that it can be added to the journals
-# 'oa_status' left out for now
-col_names = ['id',
- 'name',
- 'publisher_id_issn',
- ]
-publisher_issn = pd.DataFrame(columns = col_names)
-publisher_issn
-```
-
-
-
-
-
-
-
-
-
-
-| id | name | publisher_id_issn |
-|---|---|---|
-
-
-
-
-
-
-
-
-
-## Table Journals
-
-
-```python
-journal = pd.read_csv('sample/journals_brut.tsv', encoding='utf-8', header=0, sep='\t')
-journal
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | ... | doaj_status | lockss_title | lockss | portico_status | portico | nlch_title | nlch | qoam_av_score | doublon_issnl | oa_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | ... | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | ... | 1.0 | PLoS One | 1.0 | NaN | 0.0 | NaN | 0.0 | 4.035714 | NaN | 5 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | ... | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 907 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 908 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | ... | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 |
-| 909 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | ... | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 |
-| 910 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | ... | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 |
-
-911 rows × 23 columns
-
-
-
-
-## Table Journals Publishers
-
-
-```python
-# create the DataFrame
-col_names = ['journal',
- 'publisher_id_issn'
- ]
-journal_publisher = pd.DataFrame(columns = col_names)
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
-| journal | publisher_id_issn |
-|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# extract the publisher information from the ISSN.org data
-# note: DataFrame.append was removed in pandas 2.0, so this loop needs an older pandas
-for index, row in journal.iterrows():
-    journal_id = row['id']
-    journal_issn = row['issn']
-    if (((index/10) - int(index/10)) == 0) :
-        print(index)
-    # initialise the variables to extract
-    publisher_name = ''
-    publisher_country = ''
-    publisher_id = ''
-    publisher_id_first = ''
-    publisher_id_last = ''
-    # read the JSON file
-    if os.path.exists('issn/data/' + journal_issn + '.json'):
-        with open('issn/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
-            data = json.load(f)
-        for x in data['@graph']:
-            if ('@id' in x):
-                if (x['@id'] == 'resource/ISSN/' + journal_issn + '#LatestPublicationEvent'):
-                    if ('publishedBy' in x):
-                        publisher_id_last = x['publishedBy']
-                elif (x['@id'] == 'resource/ISSN/' + journal_issn + '#EarliestPublicationEvent'):
-                    if ('publishedBy' in x):
-                        publisher_id_first = x['publishedBy']
-        # prefer the latest publisher, fall back to the earliest one
-        if (publisher_id_last != ''):
-            publisher_id = publisher_id_last
-        else :
-            publisher_id = publisher_id_first
-        if type(publisher_id) is list:
-            for pid in publisher_id:
-                if (pid != ''):
-                    for x in data['@graph']:
-                        if ('@id' in x):
-                            if (x['@id'] == pid):
-                                if ('name' in x):
-                                    publisher_name = x['name']
-                    publisher_issn = publisher_issn.append({'publisher_id_issn' : pid, 'name' : publisher_name}, ignore_index=True)
-                    journal_publisher = journal_publisher.append({'journal' : journal_id, 'publisher_id_issn' : pid}, ignore_index=True)
-        else :
-            if (publisher_id != ''):
-                for x in data['@graph']:
-                    if ('@id' in x):
-                        if (x['@id'] == publisher_id):
-                            if ('name' in x):
-                                publisher_name = x['name']
-                publisher_issn = publisher_issn.append({'publisher_id_issn' : publisher_id, 'name' : publisher_name}, ignore_index=True)
-                journal_publisher = journal_publisher.append({'journal' : journal_id, 'publisher_id_issn' : publisher_id}, ignore_index=True)
-    else :
-        print(row['issn'] + ' - not found')
-```
-
-    0
-    10
-    20
-    ...
-    910
-
-
-
-```python
-publisher_issn
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | publisher_id_issn |
-|---|---|---|---|
-| 0 | NaN | Revue Médicale Suisse | resource/ISSN/1660-9379#Publisher-Revue_Médica... |
-| 1 | NaN | American Physical Society | resource/ISSN/0031-9007#Publisher-American_Phy... |
-| 2 | NaN | Public Library of Science | resource/ISSN/1932-6203#Publisher-Public_Libra... |
-| 3 | NaN | The Global Studies Institute de l’Université d... | resource/ISSN/2174-8454#Publisher-The_Global_S... |
-| 4 | NaN | Universitat de València, Departamento de Teorí... | resource/ISSN/2174-8454#Publisher-Universitat_... |
-| ... | ... | ... | ... |
-| 940 | NaN | IOP Publishing | resource/ISSN/0964-1726#Publisher-IOP_Publishing |
-| 941 | NaN | Elsevier [etc.] | resource/ISSN/0022-3468#Publisher-Elsevier_[etc.] |
-| 942 | NaN | Springer | resource/ISSN/1432-2064#Publisher-Springer |
-| 943 | NaN | Pergamon | resource/ISSN/0960-1481#Publisher-Pergamon |
-| 944 | NaN | American Physiological Society | resource/ISSN/0161-7567#Publisher-American_Phy... |
-
-945 rows × 3 columns
-
-
-
-
-
-```python
-# simplify the IDs: split each one at '#Publisher-'
-publisher_issn[['publisher_id_racine', 'publisher_id_fin']] = publisher_issn['publisher_id_issn'].str.split('#Publisher-', n=1, expand=True)
-publisher_issn
-```
-
-
-
-
-
-
-
-
-
-
- id
- name
- publisher_id_issn
- publisher_id_racine
- publisher_id_fin
-
-
-
-
- 0
- NaN
- Revue Médicale Suisse
- resource/ISSN/1660-9379#Publisher-Revue_Médica...
- resource/ISSN/1660-9379
- Revue_Médicale_Suisse
-
-
- 1
- NaN
- American Physical Society
- resource/ISSN/0031-9007#Publisher-American_Phy...
- resource/ISSN/0031-9007
- American_Physical_Society
-
-
- 2
- NaN
- Public Library of Science
- resource/ISSN/1932-6203#Publisher-Public_Libra...
- resource/ISSN/1932-6203
- Public_Library_of_Science
-
-
- 3
- NaN
- The Global Studies Institute de l’Université d...
- resource/ISSN/2174-8454#Publisher-The_Global_S...
- resource/ISSN/2174-8454
- The_Global_Studies_Institute_de_l’Université_d...
-
-
- 4
- NaN
- Universitat de València, Departamento de Teorí...
- resource/ISSN/2174-8454#Publisher-Universitat_...
- resource/ISSN/2174-8454
- Universitat_de_València,_Departamento_de_Teorí...
-
-
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 940
- NaN
- IOP Publishing
- resource/ISSN/0964-1726#Publisher-IOP_Publishing
- resource/ISSN/0964-1726
- IOP_Publishing
-
-
- 941
- NaN
- Elsevier [etc.]
- resource/ISSN/0022-3468#Publisher-Elsevier_[etc.]
- resource/ISSN/0022-3468
- Elsevier_[etc.]
-
-
- 942
- NaN
- Springer
- resource/ISSN/1432-2064#Publisher-Springer
- resource/ISSN/1432-2064
- Springer
-
-
- 943
- NaN
- Pergamon
- resource/ISSN/0960-1481#Publisher-Pergamon
- resource/ISSN/0960-1481
- Pergamon
-
-
- 944
- NaN
- American Physiological Society
- resource/ISSN/0161-7567#Publisher-American_Phy...
- resource/ISSN/0161-7567
- American_Physiological_Society
-
-
-
-
945 rows × 5 columns
-
-
-
-
-
-```python
-# drop the intermediate columns and keep the short ID as publisher_id_issn
-del publisher_issn['publisher_id_issn']
-del publisher_issn['publisher_id_racine']
-del publisher_issn['id']
-publisher_issn = publisher_issn.rename(columns={'publisher_id_fin': 'publisher_id_issn'})
-publisher_issn
-```
-
-
-
-
-
-
-
-
-
-
- name
- publisher_id_issn
-
-
-
-
- 0
- Revue Médicale Suisse
- Revue_Médicale_Suisse
-
-
- 1
- American Physical Society
- American_Physical_Society
-
-
- 2
- Public Library of Science
- Public_Library_of_Science
-
-
- 3
- The Global Studies Institute de l’Université d...
- The_Global_Studies_Institute_de_l’Université_d...
-
-
- 4
- Universitat de València, Departamento de Teorí...
- Universitat_de_València,_Departamento_de_Teorí...
-
-
- ...
- ...
- ...
-
-
- 940
- IOP Publishing
- IOP_Publishing
-
-
- 941
- Elsevier [etc.]
- Elsevier_[etc.]
-
-
- 942
- Springer
- Springer
-
-
- 943
- Pergamon
- Pergamon
-
-
- 944
- American Physiological Society
- American_Physiological_Society
-
-
-
-
945 rows × 2 columns
-
-
-
-
-
-```python
-# remove the duplicates (the bracket removal below is disabled)
-# publisher['publisher_id'] = publisher['publisher_id'].str.replace('[', '')
-# publisher['publisher_id'] = publisher['publisher_id'].str.replace(']', '')
-# publisher['name'] = publisher['name'].str.replace('[', '')
-# publisher['name'] = publisher['name'].str.replace(']', '')
-publisher_issn = publisher_issn.drop_duplicates(subset=['publisher_id_issn'])
-publisher_issn
-```
-
-
-
-
-
-
-
-
-
-
-| | name | publisher_id_issn |
-|---|---|---|
-| 0 | Revue Médicale Suisse | Revue_Médicale_Suisse |
-| 1 | American Physical Society | American_Physical_Society |
-| 2 | Public Library of Science | Public_Library_of_Science |
-| 3 | The Global Studies Institute de l’Université d... | The_Global_Studies_Institute_de_l’Université_d... |
-| 4 | Universitat de València, Departamento de Teorí... | Universitat_de_València,_Departamento_de_Teorí... |
-| ... | ... | ... |
-| 929 | Fisher | Fisher |
-| 930 | Tipografia La Commerciale | Tipografia_La_Commerciale |
-| 932 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic... |
-| 934 | Excerpta Medica | Excerpta_Medica |
-| 937 | Generative Grammar Group of the Department of ... | Generative_Grammar_Group_of_the_Department_of_... |
-
-380 rows × 2 columns
-
-
-
-
-
-```python
-# check for publishers without a name
-publisher_issn.loc[publisher_issn['name'] == '']
-```
-
-
-
-
-
-
-
-
-
-
-| | name | publisher_id_issn |
-|---|---|---|
-| 241 | | None |
-
-
-
-
-
-
-
-
-```python
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | publisher_id_issn |
-|---|---|---|
-| 0 | 1 | resource/ISSN/1660-9379#Publisher-Revue_Médica... |
-| 1 | 2 | resource/ISSN/0031-9007#Publisher-American_Phy... |
-| 2 | 3 | resource/ISSN/1932-6203#Publisher-Public_Libra... |
-| 3 | 4 | resource/ISSN/2174-8454#Publisher-The_Global_S... |
-| 4 | 4 | resource/ISSN/2174-8454#Publisher-Universitat_... |
-| ... | ... | ... |
-| 940 | 997 | resource/ISSN/0964-1726#Publisher-IOP_Publishing |
-| 941 | 998 | resource/ISSN/0022-3468#Publisher-Elsevier_[etc.] |
-| 942 | 999 | resource/ISSN/1432-2064#Publisher-Springer |
-| 943 | 1000 | resource/ISSN/0960-1481#Publisher-Pergamon |
-| 944 | 1001 | resource/ISSN/0161-7567#Publisher-American_Phy... |
-
-945 rows × 2 columns
-
-
-
-
-
-```python
-# simplify the IDs: split each one at '#Publisher-'
-journal_publisher[['publisher_id_racine', 'publisher_id_fin']] = journal_publisher['publisher_id_issn'].str.split('#Publisher-', n=1, expand=True)
-# drop the intermediate columns and keep the short ID
-del journal_publisher['publisher_id_issn']
-del journal_publisher['publisher_id_racine']
-journal_publisher = journal_publisher.rename(columns={'publisher_id_fin': 'publisher_id_issn'})
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
- journal
- publisher_id_issn
-
-
-
-
- 0
- 1
- Revue_Médicale_Suisse
-
-
- 1
- 2
- American_Physical_Society
-
-
- 2
- 3
- Public_Library_of_Science
-
-
- 3
- 4
- The_Global_Studies_Institute_de_l’Université_d...
-
-
- 4
- 4
- Universitat_de_València,_Departamento_de_Teorí...
-
-
- ...
- ...
- ...
-
-
- 940
- 997
- IOP_Publishing
-
-
- 941
- 998
- Elsevier_[etc.]
-
-
- 942
- 999
- Springer
-
-
- 943
- 1000
- Pergamon
-
-
- 944
- 1001
- American_Physiological_Society
-
-
-
-
945 rows × 2 columns
-
-
-
-
-
-```python
-# merge the journal-publisher links with the publisher names
-journal_publisher = pd.merge(journal_publisher, publisher_issn, on='publisher_id_issn', how='left')
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
- journal
- publisher_id_issn
- name
-
-
-
-
- 0
- 1
- Revue_Médicale_Suisse
- Revue Médicale Suisse
-
-
- 1
- 2
- American_Physical_Society
- American Physical Society
-
-
- 2
- 3
- Public_Library_of_Science
- Public Library of Science
-
-
- 3
- 4
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
-
-
- 4
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
-
-
- ...
- ...
- ...
- ...
-
-
- 940
- 997
- IOP_Publishing
- IOP Publishing
-
-
- 941
- 998
- Elsevier_[etc.]
- Elsevier [etc.]
-
-
- 942
- 999
- Springer
- Springer
-
-
- 943
- 1000
- Pergamon
- Pergamon
-
-
- 944
- 1001
- American_Physiological_Society
- American Physiological Society
-
-
-
-
945 rows × 3 columns
-
-
-
-
-
-```python
-journal_publisher = journal_publisher.rename(columns={'publisher_id_issn': 'publisher_id'})
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
- journal
- publisher_id
- name
-
-
-
-
- 0
- 1
- Revue_Médicale_Suisse
- Revue Médicale Suisse
-
-
- 1
- 2
- American_Physical_Society
- American Physical Society
-
-
- 2
- 3
- Public_Library_of_Science
- Public Library of Science
-
-
- 3
- 4
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
-
-
- 4
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
-
-
- ...
- ...
- ...
- ...
-
-
- 940
- 997
- IOP_Publishing
- IOP Publishing
-
-
- 941
- 998
- Elsevier_[etc.]
- Elsevier [etc.]
-
-
- 942
- 999
- Springer
- Springer
-
-
- 943
- 1000
- Pergamon
- Pergamon
-
-
- 944
- 1001
- American_Physiological_Society
- American Physiological Society
-
-
-
-
945 rows × 3 columns
-
-
-
-
-
-```python
-publisher = journal_publisher[['publisher_id', 'name']]
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- publisher_id
- name
-
-
-
-
- 0
- Revue_Médicale_Suisse
- Revue Médicale Suisse
-
-
- 1
- American_Physical_Society
- American Physical Society
-
-
- 2
- Public_Library_of_Science
- Public Library of Science
-
-
- 3
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
-
-
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
-
-
- ...
- ...
- ...
-
-
- 940
- IOP_Publishing
- IOP Publishing
-
-
- 941
- Elsevier_[etc.]
- Elsevier [etc.]
-
-
- 942
- Springer
- Springer
-
-
- 943
- Pergamon
- Pergamon
-
-
- 944
- American_Physiological_Society
- American Physiological Society
-
-
-
-
945 rows × 2 columns
-
-
-
-
-
-```python
-# remove the duplicates
-publisher = publisher.drop_duplicates(subset='publisher_id')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- publisher_id
- name
-
-
-
-
- 0
- Revue_Médicale_Suisse
- Revue Médicale Suisse
-
-
- 1
- American_Physical_Society
- American Physical Society
-
-
- 2
- Public_Library_of_Science
- Public Library of Science
-
-
- 3
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
-
-
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
-
-
- ...
- ...
- ...
-
-
- 929
- Fisher
- Fisher
-
-
- 930
- Tipografia_La_Commerciale
- Tipografia La Commerciale
-
-
- 932
- Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
-
-
- 934
- Excerpta_Medica
- Excerpta Medica
-
-
- 937
- Generative_Grammar_Group_of_the_Department_of_...
- Generative Grammar Group of the Department of ...
-
-
-
-
380 rows × 2 columns
-
-
-
-
-
-```python
-# convert the index into an id (first pass: the index still holds the row numbers from before deduplication)
-publisher = publisher.reset_index()
-# add the id as index + 1
-publisher['id'] = publisher['index'] + 1
-del publisher['index']
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- publisher_id
- name
- id
-
-
-
-
- 0
- Revue_Médicale_Suisse
- Revue Médicale Suisse
- 1
-
-
- 1
- American_Physical_Society
- American Physical Society
- 2
-
-
- 2
- Public_Library_of_Science
- Public Library of Science
- 3
-
-
- 3
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
- 4
-
-
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
- 5
-
-
- ...
- ...
- ...
- ...
-
-
- 375
- Fisher
- Fisher
- 930
-
-
- 376
- Tipografia_La_Commerciale
- Tipografia La Commerciale
- 931
-
-
- 377
- Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 933
-
-
- 378
- Excerpta_Medica
- Excerpta Medica
- 935
-
-
- 379
- Generative_Grammar_Group_of_the_Department_of_...
- Generative Grammar Group of the Department of ...
- 938
-
-
-
-
380 rows × 3 columns
-
-
-
-
-
-```python
-# convert the index into an id again (second pass: the ids become consecutive, 1 to 380)
-publisher = publisher.reset_index()
-# add the id as index + 1
-publisher['id'] = publisher['index'] + 1
-del publisher['index']
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- publisher_id
- name
- id
-
-
-
-
- 0
- Revue_Médicale_Suisse
- Revue Médicale Suisse
- 1
-
-
- 1
- American_Physical_Society
- American Physical Society
- 2
-
-
- 2
- Public_Library_of_Science
- Public Library of Science
- 3
-
-
- 3
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
- 4
-
-
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
- 5
-
-
- ...
- ...
- ...
- ...
-
-
- 375
- Fisher
- Fisher
- 376
-
-
- 376
- Tipografia_La_Commerciale
- Tipografia La Commerciale
- 377
-
-
- 377
- Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
-
-
- 378
- Excerpta_Medica
- Excerpta Medica
- 379
-
-
- 379
- Generative_Grammar_Group_of_the_Department_of_...
- Generative Grammar Group of the Department of ...
- 380
-
-
-
-
380 rows × 3 columns
-
-
-
-
-
-```python
-# add the UNKNOWN value
-# 'country': 999999
-publisher = publisher.append({'id' : 999999, 'name' : 'UNKNOWN', 'publisher_id': '999999'}, ignore_index=True)
-publisher
-```
-
-
-
-
-
-
-
-
-
-
-| | publisher_id | name | id |
-|---|---|---|---|
-| 0 | Revue_Médicale_Suisse | Revue Médicale Suisse | 1 |
-| 1 | American_Physical_Society | American Physical Society | 2 |
-| 2 | Public_Library_of_Science | Public Library of Science | 3 |
-| 3 | The_Global_Studies_Institute_de_l’Université_d... | The Global Studies Institute de l’Université d... | 4 |
-| 4 | Universitat_de_València,_Departamento_de_Teorí... | Universitat de València, Departamento de Teorí... | 5 |
-| ... | ... | ... | ... |
-| 376 | Tipografia_La_Commerciale | Tipografia La Commerciale | 377 |
-| 377 | Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic... | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 378 |
-| 378 | Excerpta_Medica | Excerpta Medica | 379 |
-| 379 | Generative_Grammar_Group_of_the_Department_of_... | Generative Grammar Group of the Department of ... | 380 |
-| 380 | 999999 | UNKNOWN | 999999 |
-
-381 rows × 3 columns
-
-
-
-
-
-```python
-# retrieve the numeric id of each publisher
-journal_publisher = pd.merge(journal_publisher, publisher[['publisher_id', 'id']], on='publisher_id', how='left')
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
- journal
- publisher_id
- name
- id
-
-
-
-
- 0
- 1
- Revue_Médicale_Suisse
- Revue Médicale Suisse
- 1
-
-
- 1
- 2
- American_Physical_Society
- American Physical Society
- 2
-
-
- 2
- 3
- Public_Library_of_Science
- Public Library of Science
- 3
-
-
- 3
- 4
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
- 4
-
-
- 4
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
- 5
-
-
- ...
- ...
- ...
- ...
- ...
-
-
- 940
- 997
- IOP_Publishing
- IOP Publishing
- 47
-
-
- 941
- 998
- Elsevier_[etc.]
- Elsevier [etc.]
- 75
-
-
- 942
- 999
- Springer
- Springer
- 8
-
-
- 943
- 1000
- Pergamon
- Pergamon
- 119
-
-
- 944
- 1001
- American_Physiological_Society
- American Physiological Society
- 217
-
-
-
-
945 rows × 4 columns
-
-
-
-
-
-```python
-journal_publisher = journal_publisher.rename(columns={'id': 'publisher'})
-journal_publisher
-```
-
-
-
-
-
-
-
-
-
-
- journal
- publisher_id
- name
- publisher
-
-
-
-
- 0
- 1
- Revue_Médicale_Suisse
- Revue Médicale Suisse
- 1
-
-
- 1
- 2
- American_Physical_Society
- American Physical Society
- 2
-
-
- 2
- 3
- Public_Library_of_Science
- Public Library of Science
- 3
-
-
- 3
- 4
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
- 4
-
-
- 4
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
- 5
-
-
- ...
- ...
- ...
- ...
- ...
-
-
- 940
- 997
- IOP_Publishing
- IOP Publishing
- 47
-
-
- 941
- 998
- Elsevier_[etc.]
- Elsevier [etc.]
- 75
-
-
- 942
- 999
- Springer
- Springer
- 8
-
-
- 943
- 1000
- Pergamon
- Pergamon
- 119
-
-
- 944
- 1001
- American_Physiological_Society
- American Physiological Society
- 217
-
-
-
-
945 rows × 4 columns
-
-
-
-
-
-```python
-# add the publisher id to journals_brut
-journal_publisher_ids = journal_publisher[['journal', 'publisher']]
-journal_publisher_ids = journal_publisher_ids.rename(columns={'journal': 'id'})
-journal_publisher_ids['publisher'] = journal_publisher_ids['publisher'].astype(str)
-journal_publisher_ids
-```
-
-
-
-
-
-
-
-
-
-
- id
- publisher
-
-
-
-
- 0
- 1
- 1
-
-
- 1
- 2
- 2
-
-
- 2
- 3
- 3
-
-
- 3
- 4
- 4
-
-
- 4
- 4
- 5
-
-
- ...
- ...
- ...
-
-
- 940
- 997
- 47
-
-
- 941
- 998
- 75
-
-
- 942
- 999
- 8
-
-
- 943
- 1000
- 119
-
-
- 944
- 1001
- 217
-
-
-
-
945 rows × 2 columns
-
-
-
-
-
-```python
-# concatenate the publisher values that share the same journal id
-journal_publisher_grouped = journal_publisher_ids.groupby('id').agg({'publisher': lambda x: ', '.join(x)})
-journal_publisher_grouped
-```
-
-
-
-
-
-
-
-
-
-
-| id | publisher |
-|---|---|
-| 1 | 1 |
-| 2 | 2 |
-| 3 | 3 |
-| 4 | 4, 5 |
-| 5 | 6 |
-| ... | ... |
-| 997 | 47 |
-| 998 | 75 |
-| 999 | 8 |
-| 1000 | 119 |
-| 1001 | 217 |
-
-911 rows × 1 columns
-
-
-
-
-
-```python
-# attach the grouped publisher ids to the journals
-journals = pd.merge(journal, journal_publisher_grouped, on='id', how='left')
-journals
-```
-
-
-
-
-
-
-
-
-
-
- id
- issn
- issnl
- title
- starting_year
- end_year
- url
- name_short_iso_4
- language
- country
- ...
- lockss_title
- lockss
- portico_status
- portico
- nlch_title
- nlch
- qoam_av_score
- doublon_issnl
- oa_status
- publisher
-
-
-
-
- 0
- 1
- 1660-9379
- 1660-9379
- Revue médicale suisse
- 2005
- 9999
- NaN
- Rev. méd. suisse
- 138
- 215
- ...
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
- 1
- 1
-
-
- 1
- 2
- 0031-9007
- 0031-9007
- Physical review letters (Print)
- 1958
- 9999
- http://prl.aps.org/
- Phys. rev. lett. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
- 1
- 2
-
-
- 2
- 3
- 1932-6203
- 1932-6203
- PloS one
- 2006
- 9999
- http://www.plosone.org/
- NaN
- 124
- 236
- ...
- PLoS One
- 1.0
- NaN
- 0.0
- NaN
- 0.0
- 4.035714
- NaN
- 5
- 3
-
-
- 3
- 4
- 2174-8454
- 2174-8454
- EU-topías
- 2011
- 9999
- NaN
- EU-topías
- 124, 138, 402, 292
- 209
- ...
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
- 1
- 4, 5
-
-
- 4
- 5
- 1098-0121
- 1098-0121
- Physical review. B, Condensed matter and mater...
- 1998
- 2015
- http://ojps.aip.org/prbo/
- Phys. rev., B, Condens. matter mater. phys.
- 124
- 236
- ...
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- 1.0
- 1
- 6
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 906
- 997
- 0964-1726
- 0964-1726
- Smart materials and structures (Print)
- 1992
- 9999
- NaN
- Smart mater. struct. (Print)
- 124
- 234
- ...
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
- 1
- 47
-
-
- 907
- 998
- 0022-3468
- 0022-3468
- Journal of pediatric surgery (Print)
- 1966
- 9999
- http://www.jpedsurg.org
- J. pediatr. surg. (Print)
- 124
- 236
- ...
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
- 1
- 75
-
-
- 908
- 999
- 1432-2064
- 0178-8051
- Probability theory and related fields (Internet)
- uuuu
- 9999
- http://www.springerlink.com/content/100451
- Probab. theory relat. fields (Internet)
- 124
- 83
- ...
- Probability Theory and Related Fields
- 1.0
- preserved
- 1.0
- Probability Theory and Related Fields
- 1.0
- NaN
- NaN
- 1
- 8
-
-
- 909
- 1000
- 0960-1481
- 0960-1481
- Renewable energy
- 1991
- 9999
- NaN
- Renew. energy
- 124
- 234
- ...
- NaN
- 0.0
- preserved
- 1.0
- NaN
- 0.0
- NaN
- NaN
- 1
- 119
-
-
- 910
- 1001
- 0161-7567
- 0161-7567
- Journal of applied physiology: respiratory, en...
- 1977
- 1984
- https://www.physiology.org/journal/jappl
- J. appl. physiol.: respir., environ. exercise ...
- 124
- 236
- ...
- NaN
- 0.0
- NaN
- 0.0
- NaN
- 0.0
- NaN
- NaN
- 1
- 217
-
-
-
-
911 rows × 24 columns
-
-
-
-
-
-```python
-# export csv
-publisher.to_csv('sample/publishers_brut.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-publisher.to_excel('sample/publishers_brut.xlsx', index=False)
-```
-
-
-```python
-# raw CSV export of the journals
-journals.to_csv('sample/journals_publishers_brut.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# raw Excel export of the journals
-journals.to_excel('sample/journals_publishers_brut.xlsx', index=False)
-```
-
-
-```python
-# raw CSV export of the ids
-journal_publisher_ids.to_csv('sample/journals_publishers_ids.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# raw Excel export of the ids
-journal_publisher_ids.to_excel('sample/journals_publishers_ids.xlsx', index=False)
-```
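
The extraction loop in this notebook rescans `data['@graph']` for every publisher and grows both DataFrames with `DataFrame.append`, a method that was removed in pandas 2.0. As a minimal sketch of the same lookup, assuming the `@graph` layout of the ISSN.org JSON files used here, the graph can be indexed by `@id` once and the results collected as plain rows that are passed to `pd.DataFrame` in a single call afterwards. The function names `extract_publisher_ids` and `extract_publishers` are hypothetical, and unlike the original loop a publisher without a `name` node gets an empty name rather than inheriting the previous one.

```python
def extract_publisher_ids(graph, issn):
    """Return the publisher IDs of the latest publication event, falling
    back to the earliest one, always as a list (hypothetical helper)."""
    by_id = {node['@id']: node for node in graph if '@id' in node}
    prefix = 'resource/ISSN/' + issn
    latest = by_id.get(prefix + '#LatestPublicationEvent', {})
    earliest = by_id.get(prefix + '#EarliestPublicationEvent', {})
    published_by = latest.get('publishedBy') or earliest.get('publishedBy') or []
    # 'publishedBy' is either a single string or a list of strings
    if isinstance(published_by, str):
        published_by = [published_by]
    return [pid for pid in published_by if pid]


def extract_publishers(graph, issn):
    """Return one row (dict) per publisher of the journal."""
    by_id = {node['@id']: node for node in graph if '@id' in node}
    rows = []
    for pid in extract_publisher_ids(graph, issn):
        name = by_id.get(pid, {}).get('name', '')
        rows.append({'publisher_id_issn': pid, 'name': name})
    return rows


# Example nodes modelled on the ISSN 0140-6736 record
graph = [
    {'@id': 'resource/ISSN/0140-6736#LatestPublicationEvent',
     'publishedBy': 'resource/ISSN/0140-6736#Publisher-Elsevier'},
    {'@id': 'resource/ISSN/0140-6736#Publisher-Elsevier',
     'name': 'Elsevier'},
]
rows = extract_publishers(graph, '0140-6736')
print(rows)
```

Collecting rows in a list and building the DataFrame once also avoids copying the whole table on every iteration.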
diff --git a/import_scripts/04_oacct_publishers.py b/import_scripts/04_oacct_publishers.py
deleted file mode 100644
index d18e1e59..00000000
--- a/import_scripts/04_oacct_publishers.py
+++ /dev/null
@@ -1,387 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts selected data from the sources obtained via API and processes them so that they can be used in the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Date of last update: 16.07.2021
-
-# ## Extraction of publisher data
-#
-# Sources:
-# 1. Data from ISSN.org (JSON)
-#
-# ### Source data format
-#
-# * Node: "@graph"
-# * spatial & publisher:
-# * "@id": "resource/ISSN/0140-6736",
-# * "spatial": [
-# "http://id.loc.gov/vocabulary/countries/ne",
-# "https://www.iso.org/obp/ui/#iso:code:3166:NL"
-# ],
-#
-# Example with several publishers over time:
-#
-# "publisher": [
-# "resource/ISSN/0140-6736#Publisher-Elsevier",
-# "resource/ISSN/0140-6736#Publisher-J._Onwhyn"
-# ],
-#
-# {
-# "@id": "resource/ISSN/0140-6736#LatestPublicationEvent",
-# "@type": "http://schema.org/PublicationEvent",
-# "publishedBy": "resource/ISSN/0140-6736#Publisher-Elsevier",
-# "location": "resource/ISSN/0140-6736#PublicationPlace-Amsterdam"
-# },
-#
-# {
-# "@id": "resource/ISSN/0140-6736#Publisher-Elsevier",
-# "@type": "http://schema.org/Organization",
-# "name": "Elsevier"
-# },
-#
-# Example with a single publisher over time:
-#
-# "publisher": "resource/ISSN/0899-8418#Publisher-Wiley",
-#
-# {
-# "@id": "resource/ISSN/0899-8418#EarliestPublicationEvent",
-# "@type": "http://schema.org/PublicationEvent",
-# "publishedBy": "resource/ISSN/0899-8418#Publisher-Wiley",
-# "temporal": "c1989-",
-# "location": [
-# "resource/ISSN/0899-8418#PublicationPlace-New_York",
-# "resource/ISSN/0899-8418#PublicationPlace-Chichester"
-# ]
-# },
-#
-# {
-# "@id": "resource/ISSN/0899-8418#Publisher-Wiley",
-# "@type": "http://schema.org/Organization",
-# "name": "Wiley"
-# },
-#
-# Example with a list of latest publishers:
-#
-# {
-# "@id": "resource/ISSN/2174-8454#LatestPublicationEvent",
-# "@type": "http://schema.org/PublicationEvent",
-# "publishedBy": [
-# "resource/ISSN/2174-8454#Publisher-The_Global_Studies_Institute_de_l’Université_de_Genève",
-# "resource/ISSN/2174-8454#Publisher-Universitat_de_València,_Departamento_de_Teoría_de_los_Lenguajes_y_Ciencias_de_la_Comunicación"
-# ],
-# "location": "resource/ISSN/2174-8454#PublicationPlace-Valencia"
-# },
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-
-
-# ## Table Publishers
-
-# In[2]:
-
-
-# create the DataFrame
-# 'country' removed so that it can be added to the journals instead
-# 'oa_status' removed for now
-col_names = ['id',
- 'name',
- 'publisher_id_issn',
- ]
-publisher_issn = pd.DataFrame(columns = col_names)
-publisher_issn
-
-
-# ## Table Journals
-
-# In[3]:
-
-
-journal = pd.read_csv('sample/journals_brut.tsv', encoding='utf-8', header=0, sep='\t')
-journal
-
-
-# ## Table Journals Publishers
-
-# In[4]:
-
-
-# create the DataFrame
-col_names = ['journal',
- 'publisher_id_issn'
- ]
-journal_publisher = pd.DataFrame(columns = col_names)
-journal_publisher
-
-
-# In[5]:
-
-
-# extract the information from the ISSN.org data
-for index, row in journal.iterrows():
-    journal_id = row['id']
-    journal_issn = row['issn']
-    if (((index/10) - int(index/10)) == 0) :
-        print(index)
-    # initialise the variables to extract
-    publisher_name = ''
-    publisher_country = ''
-    publisher_id = ''
-    publisher_id_first = ''
-    publisher_id_last = ''
-    # read the JSON file downloaded from ISSN.org
-    if os.path.exists('issn/data/' + journal_issn + '.json'):
-        with open('issn/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
-            data = json.load(f)
-            for x in data['@graph']:
-                if ('@id' in x):
-                    if (x['@id'] == 'resource/ISSN/' + journal_issn + '#LatestPublicationEvent'):
-                        if ('publishedBy' in x):
-                            publisher_id_last = x['publishedBy']
-                    elif (x['@id'] == 'resource/ISSN/' + journal_issn + '#EarliestPublicationEvent'):
-                        if ('publishedBy' in x):
-                            publisher_id_first = x['publishedBy']
-            if (publisher_id_last != ''):
-                publisher_id = publisher_id_last
-            else :
-                publisher_id = publisher_id_first
-            if type(publisher_id) is list:
-                for pid in publisher_id:
-                    if (pid != ''):
-                        for x in data['@graph']:
-                            if ('@id' in x):
-                                if (x['@id'] == pid):
-                                    if ('name' in x):
-                                        publisher_name = x['name']
-                        publisher_issn = publisher_issn.append({'publisher_id_issn' : pid, 'name' : publisher_name}, ignore_index=True)
-                        journal_publisher = journal_publisher.append({'journal' : journal_id, 'publisher_id_issn' : pid}, ignore_index=True)
-            else :
-                if (publisher_id != ''):
-                    for x in data['@graph']:
-                        if ('@id' in x):
-                            if (x['@id'] == publisher_id):
-                                if ('name' in x):
-                                    publisher_name = x['name']
-                    publisher_issn = publisher_issn.append({'publisher_id_issn' : publisher_id, 'name' : publisher_name}, ignore_index=True)
-                    journal_publisher = journal_publisher.append({'journal' : journal_id, 'publisher_id_issn' : publisher_id}, ignore_index=True)
-    else :
-        print(row['issn'] + ' - pas trouvé')
-
-
-# In[6]:
-
-
-publisher_issn
-
-
-# In[7]:
-
-
-# simplify the IDs: split each one at '#Publisher-'
-publisher_issn[['publisher_id_racine', 'publisher_id_fin']] = publisher_issn['publisher_id_issn'].str.split('#Publisher-', n=1, expand=True)
-publisher_issn
-
-
-# In[8]:
-
-
-# drop the intermediate columns and keep the short ID as publisher_id_issn
-del publisher_issn['publisher_id_issn']
-del publisher_issn['publisher_id_racine']
-del publisher_issn['id']
-publisher_issn = publisher_issn.rename(columns={'publisher_id_fin': 'publisher_id_issn'})
-publisher_issn
-
-
-# In[9]:
-
-
-# remove the duplicates (the bracket removal below is disabled)
-# publisher['publisher_id'] = publisher['publisher_id'].str.replace('[', '')
-# publisher['publisher_id'] = publisher['publisher_id'].str.replace(']', '')
-# publisher['name'] = publisher['name'].str.replace('[', '')
-# publisher['name'] = publisher['name'].str.replace(']', '')
-publisher_issn = publisher_issn.drop_duplicates(subset=['publisher_id_issn'])
-publisher_issn
-
-
-# In[10]:
-
-
-# check for publishers without a name
-publisher_issn.loc[publisher_issn['name'] == '']
-
-
-# In[11]:
-
-
-journal_publisher
-
-
-# In[12]:
-
-
-# simplify the IDs: split each one at '#Publisher-'
-journal_publisher[['publisher_id_racine', 'publisher_id_fin']] = journal_publisher['publisher_id_issn'].str.split('#Publisher-', n=1, expand=True)
-# drop the intermediate columns and keep the short ID
-del journal_publisher['publisher_id_issn']
-del journal_publisher['publisher_id_racine']
-journal_publisher = journal_publisher.rename(columns={'publisher_id_fin': 'publisher_id_issn'})
-journal_publisher
-
-
-# In[13]:
-
-
-# merge the journal-publisher links with the publisher names
-journal_publisher = pd.merge(journal_publisher, publisher_issn, on='publisher_id_issn', how='left')
-journal_publisher
-
-
-# In[14]:
-
-
-journal_publisher = journal_publisher.rename(columns={'publisher_id_issn': 'publisher_id'})
-journal_publisher
-
-
-# In[15]:
-
-
-publisher = journal_publisher[['publisher_id', 'name']]
-publisher
-
-
-# In[16]:
-
-
-# remove the duplicates
-publisher = publisher.drop_duplicates(subset='publisher_id')
-publisher
-
-
-# In[17]:
-
-
-# convert the index into an id (first pass: the index still holds the row numbers from before deduplication)
-publisher = publisher.reset_index()
-# add the id as index + 1
-publisher['id'] = publisher['index'] + 1
-del publisher['index']
-publisher
-
-
-# In[18]:
-
-
-# convert the index into an id again (second pass: the ids become consecutive)
-publisher = publisher.reset_index()
-# add the id as index + 1
-publisher['id'] = publisher['index'] + 1
-del publisher['index']
-publisher
-
-
-# In[19]:
-
-
-# add the UNKNOWN value
-# 'country': 999999
-publisher = publisher.append({'id' : 999999, 'name' : 'UNKNOWN', 'publisher_id': '999999'}, ignore_index=True)
-publisher
-
-
-# In[20]:
-
-
-# retrieve the numeric id of each publisher
-journal_publisher = pd.merge(journal_publisher, publisher[['publisher_id', 'id']], on='publisher_id', how='left')
-journal_publisher
-
-
-# In[21]:
-
-
-journal_publisher = journal_publisher.rename(columns={'id': 'publisher'})
-journal_publisher
-
-
-# In[22]:
-
-
-# add the publisher id to journals_brut
-journal_publisher_ids = journal_publisher[['journal', 'publisher']]
-journal_publisher_ids = journal_publisher_ids.rename(columns={'journal': 'id'})
-journal_publisher_ids['publisher'] = journal_publisher_ids['publisher'].astype(str)
-journal_publisher_ids
-
-
-# In[23]:
-
-
-# concatenate the publisher values that share the same journal id
-journal_publisher_grouped = journal_publisher_ids.groupby('id').agg({'publisher': lambda x: ', '.join(x)})
-journal_publisher_grouped
-
-
-# In[24]:
-
-
-# attach the grouped publisher ids to the journals
-journals = pd.merge(journal, journal_publisher_grouped, on='id', how='left')
-journals
-
-
-# In[25]:
-
-
-# export csv
-publisher.to_csv('sample/publishers_brut.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[26]:
-
-
-# export excel
-publisher.to_excel('sample/publishers_brut.xlsx', index=False)
-
-
-# In[27]:
-
-
-# raw CSV export of the journals
-journals.to_csv('sample/journals_publishers_brut.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[28]:
-
-
-# raw Excel export of the journals
-journals.to_excel('sample/journals_publishers_brut.xlsx', index=False)
-
-
-# In[29]:
-
-
-# raw CSV export of the ids
-journal_publisher_ids.to_csv('sample/journals_publishers_ids.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[30]:
-
-
-# raw Excel export of the ids
-journal_publisher_ids.to_excel('sample/journals_publishers_ids.xlsx', index=False)
-
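
The post-processing steps of this script (shorten each publisher ID at `#Publisher-`, drop duplicates, number the remaining publishers from 1, then join the publisher ids of each journal) can be summarised in a small pure-Python sketch. It is an illustration under stated assumptions rather than the script's own method: `short_publisher_id` and `build_tables` are hypothetical names, and the numbering is done in a single pass instead of the two `reset_index()` calls used in the script.

```python
def short_publisher_id(full_id):
    """Return the part after '#Publisher-', or None if the marker is missing."""
    _, sep, tail = full_id.partition('#Publisher-')
    return tail if sep else None


def build_tables(links):
    """links: list of (journal_id, full_publisher_id) pairs.

    Returns (publisher_ids, grouped) where publisher_ids maps each short
    publisher ID to a consecutive number starting at 1, and grouped maps
    each journal to its publisher numbers joined with ', '.
    """
    publisher_ids = {}
    for _, full_id in links:
        short = short_publisher_id(full_id)
        if short not in publisher_ids:  # deduplicate, keep first-seen order
            publisher_ids[short] = len(publisher_ids) + 1
    grouped = {}
    for journal_id, full_id in links:
        number = publisher_ids[short_publisher_id(full_id)]
        grouped.setdefault(journal_id, []).append(str(number))
    return publisher_ids, {j: ', '.join(p) for j, p in grouped.items()}


links = [
    (1, 'resource/ISSN/1660-9379#Publisher-Revue_Médicale_Suisse'),
    (2, 'resource/ISSN/0031-9007#Publisher-American_Physical_Society'),
    (3, 'resource/ISSN/1932-6203#Publisher-Public_Library_of_Science'),
    (4, 'resource/ISSN/2174-8454#Publisher-The_Global_Studies_Institute_de_l’Université_de_Genève'),
    (4, 'resource/ISSN/2174-8454#Publisher-Universitat_de_València,_Departamento_de_Teoría_de_los_Lenguajes_y_Ciencias_de_la_Comunicación'),
]
publisher_ids, grouped = build_tables(links)
print(grouped)
```

With these five links, journal 4 ends up with two publisher numbers joined into one string, which is the same shape as the `publisher` column exported to `journals_publishers_brut.tsv`.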
diff --git a/import_scripts/05_oacct_issns.md b/import_scripts/05_oacct_issns.md
deleted file mode 100644
index 39a18fd7..00000000
--- a/import_scripts/05_oacct_issns.md
+++ /dev/null
@@ -1,2109 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) project
-
-P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts selected data from the sources obtained via API and processes them so that they can be used in the OACCT application.
-
-Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-Date of last update: 16.07.2021
-
-## Table ISSNs
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-```
-
-
-```python
-# add the ISSN-L values
-issns = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-issns
-```
-
-| | ISSN | ISSN-L |
-|---|---|---|
-| 0 | 0000-0019 | 0000-0019 |
-| 1 | 0000-0027 | 0000-0027 |
-| 2 | 0000-0043 | 0000-0043 |
-| 3 | 0000-0051 | 0000-0051 |
-| 4 | 0000-006X | 0000-006X |
-| ... | ... | ... |
-| 1995913 | 8756-9957 | 8756-9957 |
-| 1995914 | 8756-9965 | 8756-9965 |
-| 1995915 | 8756-9973 | 8756-9973 |
-| 1995916 | 8756-9981 | 8756-9981 |
-| 1995917 | 8756-999X | 8756-999X |
-
-1995918 rows × 2 columns
-
-```python
-# rename the columns
-issns = issns.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-issns
-```
-
-| | issn | issnl |
-|---|---|---|
-| 0 | 0000-0019 | 0000-0019 |
-| 1 | 0000-0027 | 0000-0027 |
-| 2 | 0000-0043 | 0000-0043 |
-| 3 | 0000-0051 | 0000-0051 |
-| 4 | 0000-006X | 0000-006X |
-| ... | ... | ... |
-| 1995913 | 8756-9957 | 8756-9957 |
-| 1995914 | 8756-9965 | 8756-9965 |
-| 1995915 | 8756-9973 | 8756-9973 |
-| 1995916 | 8756-9981 | 8756-9981 |
-| 1995917 | 8756-999X | 8756-999X |
-
-1995918 rows × 2 columns
-
-```python
-journals = pd.read_csv('sample/journals_brut.tsv', encoding='utf-8', sep='\t', usecols=(['id', 'issn', 'issnl']))
-journals
-```
-
-| | id | issn | issnl |
-|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 |
-| 1 | 2 | 0031-9007 | 0031-9007 |
-| 2 | 3 | 1932-6203 | 1932-6203 |
-| 3 | 4 | 2174-8454 | 2174-8454 |
-| 4 | 5 | 1098-0121 | 1098-0121 |
-| ... | ... | ... | ... |
-| 906 | 997 | 0964-1726 | 0964-1726 |
-| 907 | 998 | 0022-3468 | 0022-3468 |
-| 908 | 999 | 1432-2064 | 0178-8051 |
-| 909 | 1000 | 0960-1481 | 0960-1481 |
-| 910 | 1001 | 0161-7567 | 0161-7567 |
-
-911 rows × 3 columns
-
-```python
-# rename the id column to journal
-journals = journals.rename(columns = {'id' : 'journal'})
-journals
-```
-
-| | journal | issn | issnl |
-|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 |
-| 1 | 2 | 0031-9007 | 0031-9007 |
-| 2 | 3 | 1932-6203 | 1932-6203 |
-| 3 | 4 | 2174-8454 | 2174-8454 |
-| 4 | 5 | 1098-0121 | 1098-0121 |
-| ... | ... | ... | ... |
-| 906 | 997 | 0964-1726 | 0964-1726 |
-| 907 | 998 | 0022-3468 | 0022-3468 |
-| 908 | 999 | 1432-2064 | 0178-8051 |
-| 909 | 1000 | 0960-1481 | 0960-1481 |
-| 910 | 1001 | 0161-7567 | 0161-7567 |
-
-911 rows × 3 columns
-
-```python
-# check for journals without an ISSN
-journals.loc[journals['issn'].isna()]
-```
-
-| | journal | issn | issnl |
-|---|---|---|---|
-
-```python
-journals.loc[journals['journal'] == 5]
-```
-
-| | journal | issn | issnl |
-|---|---|---|---|
-| 4 | 5 | 1098-0121 | 1098-0121 |
-
-## Format extraction
-
-
-```python
-# create the DataFrame
-col_names = ['issn',
- 'format'
- ]
-journals_format = pd.DataFrame(columns = col_names)
-journals_format
-```
-
-| | issn | format |
-|---|---|---|
-
-```python
-# extract the format of each ISSN from the ISSN.org records
-for index, row in journals.iterrows():
-    myissn = row['issn']
-    # print progress every 10 rows
-    if index % 10 == 0:
-        print(index)
-    # default value when the record has no format
-    myformat = np.nan
-    # load the JSON-LD record saved for this ISSN
-    path = 'issn/data/' + myissn + '.json'
-    if os.path.exists(path):
-        with open(path, 'r', encoding='utf-8') as f:
-            data = json.load(f)
-        for x in data['@graph']:
-            if x.get('@id') == 'resource/ISSN/' + myissn and 'format' in x:
-                myformats = x['format']
-                # keep only the first value when several formats are listed
-                if isinstance(myformats, list):
-                    myformats = myformats[0]
-                myformat = myformats.replace('vocabularies/medium#', '')
-        journals_format.at[index, 'issn'] = myissn
-        journals_format.at[index, 'format'] = myformat
-    else:
-        # 'pas trouvé' = not found
-        print(myissn + ' - pas trouvé')
-```
-
- 0
- 10
- 20
- 30
- 40
- 50
- 60
- 70
- 80
- 90
- 100
- 110
- 120
- 130
- 140
- 150
- 160
- 170
- 180
- 190
- 200
- 210
- 220
- 230
- 240
- 250
- 260
- 270
- 280
- 290
- 300
- 310
- 320
- 330
- 340
- 350
- 360
- 370
- 380
- 390
- 400
- 410
- 420
- 430
- 440
- 450
- 460
- 470
- 480
- 490
- 500
- 510
- 520
- 530
- 540
- 550
- 560
- 570
- 580
- 590
- 600
- 610
- 620
- 630
- 640
- 650
- 660
- 670
- 680
- 690
- 700
- 710
- 720
- 730
- 740
- 750
- 760
- 770
- 780
- 790
- 800
- 810
- 820
- 830
- 840
- 850
- 860
- 870
- 880
- 890
- 900
- 910
-
-
-
-```python
-journals_format
-```
-
-| | issn | format |
-|---|---|---|
-| 0 | 1660-9379 | Print |
-| 1 | 0031-9007 | Print |
-| 2 | 1932-6203 | Online |
-| 3 | 2174-8454 | Print |
-| 4 | 1098-0121 | Print |
-| ... | ... | ... |
-| 906 | 0964-1726 | Print |
-| 907 | 0022-3468 | Print |
-| 908 | 1432-2064 | Online |
-| 909 | 0960-1481 | Print |
-| 910 | 0161-7567 | Print |
-
-911 rows × 2 columns
-
-```python
-# check for rows without a format
-journals_format.loc[journals_format['format'].isnull()]
-```
-
-| | issn | format |
-|---|---|---|
-
-```python
-journals_format['format'].value_counts()
-```
-
-
-
-
- Print 817
- Online 92
- Other 2
- Name: format, dtype: int64
-
-
-
-
-```python
-del journals['issn']
-```
-
-
-```python
-issns = pd.merge(issns, journals, on='issnl', how='outer')
-issns
-```
-
-| | issn | issnl | journal |
-|---|---|---|---|
-| 0 | 0000-0019 | 0000-0019 | NaN |
-| 1 | 2150-4008 | 0000-0019 | NaN |
-| 2 | 0000-0027 | 0000-0027 | NaN |
-| 3 | 0000-0043 | 0000-0043 | NaN |
-| 4 | 0000-0051 | 0000-0051 | NaN |
-| ... | ... | ... | ... |
-| 1995915 | 8756-9973 | 8756-9973 | NaN |
-| 1995916 | 8756-9981 | 8756-9981 | NaN |
-| 1995917 | 8756-999X | 8756-999X | NaN |
-| 1995918 | NaN | 2624-8557 | 120.0 |
-| 1995919 | NaN | 0032-1052 | 936.0 |
-
-1995920 rows × 3 columns
-
-```python
-# check for rows without an ISSN
-issns.loc[issns['issn'].isna()]
-```
-
-| | issn | issnl | journal |
-|---|---|---|---|
-| 1995918 | NaN | 2624-8557 | 120.0 |
-| 1995919 | NaN | 0032-1052 | 936.0 |
-
-```python
-# keep only the rows with a non-null ISSN
-issns = issns.loc[issns['issn'].notna()]
-```
-
-
-```python
-# keep only the rows matched to a journal by the merge
-issns2 = issns.loc[issns['journal'].notna()]
-issns2
-```
-
-| | issn | issnl | journal |
-|---|---|---|---|
-| 334 | 0001-2815 | 0001-2815 | 532.0 |
-| 335 | 1399-0039 | 0001-2815 | 532.0 |
-| 493 | 0001-4842 | 0001-4842 | 498.0 |
-| 494 | 1520-4898 | 0001-4842 | 498.0 |
-| 505 | 0001-4966 | 0001-4966 | 789.0 |
-| ... | ... | ... | ... |
-| 1921352 | 2470-0045 | 2470-0045 | 533.0 |
-| 1921353 | 2470-0053 | 2470-0045 | 533.0 |
-| 1925740 | 2475-9953 | 2475-9953 | 608.0 |
-| 1951854 | 2504-4427 | 2504-4427 | 994.0 |
-| 1951855 | 2504-4435 | 2504-4427 | 994.0 |
-
-1760 rows × 3 columns
-
-```python
-# add the format for each ISSN
-issns2 = pd.merge(issns2, journals_format, on='issn', how='outer')
-issns2
-```
-
-| | issn | issnl | journal | format |
-|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532.0 | Print |
-| 1 | 1399-0039 | 0001-2815 | 532.0 | NaN |
-| 2 | 0001-4842 | 0001-4842 | 498.0 | Print |
-| 3 | 1520-4898 | 0001-4842 | 498.0 | NaN |
-| 4 | 0001-4966 | 0001-4966 | 789.0 | Print |
-| ... | ... | ... | ... | ... |
-| 1758 | 2504-4427 | 2504-4427 | 994.0 | Print |
-| 1759 | 2504-4435 | 2504-4427 | 994.0 | NaN |
-| 1760 | 2624-8557 | NaN | NaN | Online |
-| 1761 | 2469-9926 | NaN | NaN | Print |
-| 1762 | 1529-4242 | NaN | NaN | Online |
-
-1763 rows × 4 columns
-
-```python
-# keep only the rows linked to a journal (explicit copy, so later column assignments do not raise SettingWithCopyWarning)
-issns2 = issns2.loc[issns2['journal'].notna()].copy()
-issns2
-```
-
-| | issn | issnl | journal | format |
-|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532.0 | Print |
-| 1 | 1399-0039 | 0001-2815 | 532.0 | NaN |
-| 2 | 0001-4842 | 0001-4842 | 498.0 | Print |
-| 3 | 1520-4898 | 0001-4842 | 498.0 | NaN |
-| 4 | 0001-4966 | 0001-4966 | 789.0 | Print |
-| ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533.0 | Other |
-| 1756 | 2470-0053 | 2470-0045 | 533.0 | NaN |
-| 1757 | 2475-9953 | 2475-9953 | 608.0 | Online |
-| 1758 | 2504-4427 | 2504-4427 | 994.0 | Print |
-| 1759 | 2504-4435 | 2504-4427 | 994.0 | NaN |
-
-1760 rows × 4 columns
-
-```python
-issns2['format'] = issns2['format'].str.upper()
-issns2['format'] = issns2['format'].str.replace('ONLINE', 'ELECTRONIC')
-# DigitalCarrier
-issns2['format'] = issns2['format'].str.replace('DIGITALCARRIER', 'ELECTRONIC')
-issns2
-```
-
-
-| | issn | issnl | journal | format |
-|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532.0 | PRINT |
-| 1 | 1399-0039 | 0001-2815 | 532.0 | NaN |
-| 2 | 0001-4842 | 0001-4842 | 498.0 | PRINT |
-| 3 | 1520-4898 | 0001-4842 | 498.0 | NaN |
-| 4 | 0001-4966 | 0001-4966 | 789.0 | PRINT |
-| ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533.0 | OTHER |
-| 1756 | 2470-0053 | 2470-0045 | 533.0 | NaN |
-| 1757 | 2475-9953 | 2475-9953 | 608.0 | ELECTRONIC |
-| 1758 | 2504-4427 | 2504-4427 | 994.0 | PRINT |
-| 1759 | 2504-4435 | 2504-4427 | 994.0 | NaN |
-
-1760 rows × 4 columns
-
-```python
-issns2['format'].value_counts()
-```
-
-
-
-
- PRINT 816
- ELECTRONIC 90
- OTHER 2
- Name: format, dtype: int64
-
-
-
-
-```python
-# check for rows without a format
-issns2.loc[issns2['format'].isnull()]
-```
-
-| | issn | issnl | journal | format |
-|---|---|---|---|---|
-| 1 | 1399-0039 | 0001-2815 | 532.0 | NaN |
-| 3 | 1520-4898 | 0001-4842 | 498.0 | NaN |
-| 5 | 1520-8524 | 0001-4966 | 789.0 | NaN |
-| 6 | 1520-9024 | 0001-4966 | 789.0 | NaN |
-| 8 | 0942-0940 | 0001-6268 | 166.0 | NaN |
-| ... | ... | ... | ... | ... |
-| 1750 | 2469-9934 | 2469-9926 | 870.0 | NaN |
-| 1752 | 2469-9969 | 2469-9950 | 41.0 | NaN |
-| 1754 | 2470-0029 | 2470-0010 | 80.0 | NaN |
-| 1756 | 2470-0053 | 2470-0045 | 533.0 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994.0 | NaN |
-
-852 rows × 4 columns
-
-```python
-# assign the numeric ISSN type id (a missing format falls back to 3)
-# PRINT = 1
-# ELECTRONIC = 2
-# OTHER = 3
-issns2['issn_type'] = issns2['format']
-issns2['issn_type'] = issns2['issn_type'].str.replace('PRINT', '1')
-issns2['issn_type'] = issns2['issn_type'].str.replace('ELECTRONIC', '2')
-issns2['issn_type'] = issns2['issn_type'].str.replace('OTHER', '3')
-issns2['issn_type'] = issns2['issn_type'].fillna(3)
-issns2
-```
-
-
-| | issn | issnl | journal | format | issn_type |
-|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532.0 | PRINT | 1 |
-| 1 | 1399-0039 | 0001-2815 | 532.0 | NaN | 3 |
-| 2 | 0001-4842 | 0001-4842 | 498.0 | PRINT | 1 |
-| 3 | 1520-4898 | 0001-4842 | 498.0 | NaN | 3 |
-| 4 | 0001-4966 | 0001-4966 | 789.0 | PRINT | 1 |
-| ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533.0 | OTHER | 3 |
-| 1756 | 2470-0053 | 2470-0045 | 533.0 | NaN | 3 |
-| 1757 | 2475-9953 | 2475-9953 | 608.0 | ELECTRONIC | 2 |
-| 1758 | 2504-4427 | 2504-4427 | 994.0 | PRINT | 1 |
-| 1759 | 2504-4435 | 2504-4427 | 994.0 | NaN | 3 |
-
-1760 rows × 5 columns
-
-```python
-# convert the journal id to int
-issns2['journal'] = issns2['journal'].astype(int)
-```
-
-
-
-
-
-```python
-# turn the index into a 1-based id column
-issns2 = issns2.reset_index()
-issns2['id'] = issns2['index'] + 1
-del issns2['index']
-issns2
-```
-
-| | issn | issnl | journal | format | issn_type | id |
-|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 |
-
-1760 rows × 6 columns
-
-```python
-issns2['issn_type'] = issns2['issn_type'].astype(int)
-```
-
-
-```python
-# drop duplicate ISSNs
-issns2 = issns2.drop_duplicates(subset='issn')
-issns2
-```
-
-| | issn | issnl | journal | format | issn_type | id |
-|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 |
-
-1760 rows × 6 columns
-
-```python
-# export csv
-issns2.to_csv('sample/issn_brut.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-issns2.to_excel('sample/issn_brut.xlsx', index=False)
-```
-
-
-```python
-# export the ids as TSV
-issns2[['id', 'issn', 'issnl', 'journal']].to_csv('sample/issn_ids.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export the ids as Excel
-issns2[['id', 'issn', 'issnl', 'journal']].to_excel('sample/issn_ids.xlsx', index=False)
-```
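The format extraction and type-id assignment performed by the removed notebook above can be summarised in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the notebook's code: `extract_format` and `issn_type_id` are hypothetical helper names, the sample record only imitates the fields the notebook reads (`@graph`, `@id`, `format`), and mapping any unknown label to 3 is an assumption (the notebook only fills missing values with 3).

```python
import json

MEDIUM_PREFIX = 'vocabularies/medium#'
# ONLINE and DIGITALCARRIER are folded into ELECTRONIC, as in the notebook
ALIASES = {'ONLINE': 'ELECTRONIC', 'DIGITALCARRIER': 'ELECTRONIC'}
TYPE_IDS = {'PRINT': 1, 'ELECTRONIC': 2, 'OTHER': 3}


def extract_format(record, issn):
    """Return the medium label of `issn` in a JSON-LD record, or None."""
    for node in record.get('@graph', []):
        if node.get('@id') == 'resource/ISSN/' + issn and 'format' in node:
            fmt = node['format']
            if isinstance(fmt, list):  # several media listed: keep the first
                fmt = fmt[0]
            return fmt.replace(MEDIUM_PREFIX, '')
    return None


def issn_type_id(fmt):
    """Map a medium label to the numeric type id; missing or unknown gives 3."""
    if fmt is None:
        return TYPE_IDS['OTHER']
    label = ALIASES.get(fmt.upper(), fmt.upper())
    return TYPE_IDS.get(label, TYPE_IDS['OTHER'])


# hypothetical record, shaped like the fields the notebook reads
sample = json.loads('''
{"@graph": [
  {"@id": "resource/ISSN/1932-6203", "format": "vocabularies/medium#Online"},
  {"@id": "resource/ISSN/0031-9007", "format": ["vocabularies/medium#Print"]}
]}
''')

for issn in ('1932-6203', '0031-9007', '1399-0039'):
    fmt = extract_format(sample, issn)
    print(issn, fmt, issn_type_id(fmt))
```

Unlike the notebook, which chains `str.replace` calls on a column, this sketch uses exact dictionary lookups, so a label such as `PRINT` can never be partially rewritten inside a longer string.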
diff --git a/import_scripts/05_oacct_issns.py b/import_scripts/05_oacct_issns.py
deleted file mode 100644
index af282efb..00000000
--- a/import_scripts/05_oacct_issns.py
+++ /dev/null
@@ -1,280 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts selected data from the sources retrieved via API and processes them for use in the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# ## ISSN table
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-
-
-# In[2]:
-
-
-# load the ISSN to ISSN-L mapping table
-issns = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-issns
-
-
-# In[3]:
-
-
-# rename the columns
-issns = issns.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-issns
-
-
-# In[4]:
-
-
-journals = pd.read_csv('sample/journals_brut.tsv', encoding='utf-8', sep='\t', usecols=(['id', 'issn', 'issnl']))
-journals
-
-
-# In[5]:
-
-
-# rename the id column to journal
-journals = journals.rename(columns = {'id' : 'journal'})
-journals
-
-
-# In[6]:
-
-
-# check for journals without an ISSN
-journals.loc[journals['issn'].isna()]
-
-
-# In[7]:
-
-
-journals.loc[journals['journal'] == 5]
-
-
-# ## Format extraction
-
-# In[8]:
-
-
-# create the DataFrame
-col_names = ['issn',
- 'format'
- ]
-journals_format = pd.DataFrame(columns = col_names)
-journals_format
-
-
-# In[9]:
-
-
-# extract the format of each ISSN from the ISSN.org records
-for index, row in journals.iterrows():
-    myissn = row['issn']
-    # print progress every 10 rows
-    if index % 10 == 0:
-        print(index)
-    # default value when the record has no format
-    myformat = np.nan
-    # load the JSON-LD record saved for this ISSN
-    path = 'issn/data/' + myissn + '.json'
-    if os.path.exists(path):
-        with open(path, 'r', encoding='utf-8') as f:
-            data = json.load(f)
-        for x in data['@graph']:
-            if x.get('@id') == 'resource/ISSN/' + myissn and 'format' in x:
-                myformats = x['format']
-                # keep only the first value when several formats are listed
-                if isinstance(myformats, list):
-                    myformats = myformats[0]
-                myformat = myformats.replace('vocabularies/medium#', '')
-        journals_format.at[index, 'issn'] = myissn
-        journals_format.at[index, 'format'] = myformat
-    else:
-        # 'pas trouvé' = not found
-        print(myissn + ' - pas trouvé')
-
-
-# In[10]:
-
-
-journals_format
-
-
-# In[11]:
-
-
-# check for rows without a format
-journals_format.loc[journals_format['format'].isnull()]
-
-
-# In[12]:
-
-
-journals_format['format'].value_counts()
-
-
-# In[13]:
-
-
-del journals['issn']
-
-
-# In[14]:
-
-
-issns = pd.merge(issns, journals, on='issnl', how='outer')
-issns
-
-
-# In[15]:
-
-
-# check for rows without an ISSN
-issns.loc[issns['issn'].isna()]
-
-
-# In[16]:
-
-
-# keep only the rows with a non-null ISSN
-issns = issns.loc[issns['issn'].notna()]
-
-
-# In[17]:
-
-
-# keep only the rows matched to a journal by the merge
-issns2 = issns.loc[issns['journal'].notna()]
-issns2
-
-
-# In[18]:
-
-
-# add the format for each ISSN
-issns2 = pd.merge(issns2, journals_format, on='issn', how='outer')
-issns2
-
-
-# In[19]:
-
-
-# keep only the rows linked to a journal (explicit copy, so later column assignments do not raise SettingWithCopyWarning)
-issns2 = issns2.loc[issns2['journal'].notna()].copy()
-issns2
-
-
-# In[20]:
-
-
-issns2['format'] = issns2['format'].str.upper()
-issns2['format'] = issns2['format'].str.replace('ONLINE', 'ELECTRONIC')
-# DigitalCarrier
-issns2['format'] = issns2['format'].str.replace('DIGITALCARRIER', 'ELECTRONIC')
-issns2
-
-
-# In[21]:
-
-
-issns2['format'].value_counts()
-
-
-# In[22]:
-
-
-# check for rows without a format
-issns2.loc[issns2['format'].isnull()]
-
-
-# In[23]:
-
-
-# assign the numeric ISSN type id (a missing format falls back to 3)
-# PRINT = 1
-# ELECTRONIC = 2
-# OTHER = 3
-issns2['issn_type'] = issns2['format']
-issns2['issn_type'] = issns2['issn_type'].str.replace('PRINT', '1')
-issns2['issn_type'] = issns2['issn_type'].str.replace('ELECTRONIC', '2')
-issns2['issn_type'] = issns2['issn_type'].str.replace('OTHER', '3')
-issns2['issn_type'] = issns2['issn_type'].fillna(3)
-issns2
-
-
-# In[24]:
-
-
-# convert the journal id to int
-issns2['journal'] = issns2['journal'].astype(int)
-
-
-# In[25]:
-
-
-# turn the index into a 1-based id column
-issns2 = issns2.reset_index()
-issns2['id'] = issns2['index'] + 1
-del issns2['index']
-issns2
-
-
-# In[26]:
-
-
-issns2['issn_type'] = issns2['issn_type'].astype(int)
-
-
-# In[27]:
-
-
-# drop duplicate ISSNs
-issns2 = issns2.drop_duplicates(subset='issn')
-issns2
-
-
-# In[28]:
-
-
-# export csv
-issns2.to_csv('sample/issn_brut.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[29]:
-
-
-# export excel
-issns2.to_excel('sample/issn_brut.xlsx', index=False)
-
-
-# In[30]:
-
-
-# export the ids as TSV
-issns2[['id', 'issn', 'issnl', 'journal']].to_csv('sample/issn_ids.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[31]:
-
-
-# export the ids as Excel
-issns2[['id', 'issn', 'issnl', 'journal']].to_excel('sample/issn_ids.xlsx', index=False)
-
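The way the removed `05_oacct_issns.py` script links every ISSN to a journal through its ISSN-L (two `merge` calls followed by filters) can be sketched without pandas. This is a simplified illustration, not the script's logic line for line: `link_issns` is a hypothetical helper, ids are assigned after de-duplication here, and a journal sharing its ISSN-L with another one would be overwritten in the lookup. The sample values come from the tables shown in the notebook output.

```python
def link_issns(issn_to_issnl, journals):
    """Link every ISSN to a journal through its ISSN-L.

    Returns (rows, unmatched): `rows` holds one dict per linked ISSN with a
    1-based id, `unmatched` lists the journals whose ISSN-L is not in the
    mapping and which therefore get no ISSN row at all.
    """
    journal_by_issnl = {j['issnl']: j['journal'] for j in journals}
    rows = []
    seen = set()  # ISSNs already emitted, to skip duplicates
    for issn, issnl in issn_to_issnl:
        if issnl in journal_by_issnl and issn not in seen:
            seen.add(issn)
            rows.append({'id': len(rows) + 1, 'issn': issn, 'issnl': issnl,
                         'journal': journal_by_issnl[issnl]})
    linked = {row['issnl'] for row in rows}
    unmatched = [j for j in journals if j['issnl'] not in linked]
    return rows, unmatched


# (ISSN, ISSN-L) pairs and journals taken from the notebook output
mapping = [('0000-0019', '0000-0019'),
           ('0001-2815', '0001-2815'),
           ('1399-0039', '0001-2815')]
journals = [{'journal': 532, 'issnl': '0001-2815'},
            {'journal': 120, 'issnl': '2624-8557'}]

rows, unmatched = link_issns(mapping, journals)
for row in rows:
    print(row)
print('journals without any ISSN row:', unmatched)
```

Journal 120 illustrates the case visible in the notebook output: its ISSN-L (2624-8557) is absent from the 2017 mapping file, so the outer merge yields a row with a missing ISSN that the next filter drops.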
diff --git a/import_scripts/06_oacct_sherpa.md b/import_scripts/06_oacct_sherpa.md
deleted file mode 100644
index b7077461..00000000
--- a/import_scripts/06_oacct_sherpa.md
+++ /dev/null
@@ -1,9819 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) project
-
-P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts the Sherpa/Romeo data retrieved via API and processes them for use in the OACCT application.
-
-Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-## Sherpa/Romeo data
-
-### Example
-
-https://v2.sherpa.ac.uk/cgi/retrieve_by_id?item-type=publication&api-key=EEE6F146-678E-11EB-9C3A-202F3DE2659A&format=Json&identifier=17601
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# display all columns
-pd.set_option('display.max_columns', None)
-```
-
-## Table publisher_sherpa
-
-
-```python
-# create the DataFrame
-col_names = ['journal',
- 'publisher_id',
- 'name',
- 'country',
- 'type',
- 'url'
- ]
-publisher_sherpa = pd.DataFrame(columns = col_names)
-publisher_sherpa
-```
-
-| | journal | publisher_id | name | country | type | url |
-|---|---|---|---|---|---|---|
-
-## Table sherpa match issn
-
-
-```python
-# create the DataFrame
-col_names = ['issn',
- 'sherpa_match',
- ]
-sherpa_match_issn = pd.DataFrame(columns = col_names)
-sherpa_match_issn
-```
-
-| | issn | sherpa_match |
-|---|---|---|
-
-## Table sherpa issns
-
-
-```python
-# create the DataFrame
-col_names = ['issn',
- 'type',
- ]
-sherpa_issn = pd.DataFrame(columns = col_names)
-sherpa_issn
-```
-
-| | issn | type |
-|---|---|---|
-
-## Table sherpa journals
-
-
-```python
-# create the DataFrame
-col_names = ['journal',
- 'title',
- 'url',
- ]
-sherpa_journal = pd.DataFrame(columns = col_names)
-sherpa_journal
-```
-
-| | journal | title | url |
-|---|---|---|---|
-
-## Import the Journals and ISSN tables
-
-
-```python
-journal = pd.read_csv('sample/journals_publishers_brut.tsv', encoding='utf-8', header=0, sep='\t')
-journal
-```
-
-| | id | issn | issnl | title | starting_year | end_year | url | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss | portico_status | portico | nlch_title | nlch | qoam_av_score | doublon_issnl | oa_status | publisher |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 1 |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 | 2 |
-| 2 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1.0 | Yes | 1.0 | PLoS One | 1.0 | NaN | 0.0 | NaN | 0.0 | 4.035714 | NaN | 5 | 3 |
-| 3 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 4, 5 |
-| 4 | 5 | 1098-0121 | 1098-0121 | Physical review. B, Condensed matter and mater... | 1998 | 2015 | http://ojps.aip.org/prbo/ | Phys. rev., B, Condens. matter mater. phys. | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 | 6 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | 0964-1726 | 0964-1726 | Smart materials and structures (Print) | 1992 | 9999 | NaN | Smart mater. struct. (Print) | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 47 |
-| 907 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 75 |
-| 908 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 | 8 |
-| 909 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 119 |
-| 910 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 217 |
-
-911 rows × 24 columns
-
-```python
-issn = pd.read_csv('sample/issn_brut.tsv', encoding='utf-8', header=0, sep='\t')
-issn
-```
-
-| | issn | issnl | journal | format | issn_type | id |
-|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 |
-
-1760 rows × 6 columns
-
-```python
-issn_ids = pd.read_csv('sample/issn_ids.tsv', encoding='utf-8', header=0, sep='\t')
-issn_ids
-```
-
-| | id | issn | issnl | journal |
-|---|---|---|---|---|
-| 0 | 1 | 0001-2815 | 0001-2815 | 532 |
-| 1 | 2 | 1399-0039 | 0001-2815 | 532 |
-| 2 | 3 | 0001-4842 | 0001-4842 | 498 |
-| 3 | 4 | 1520-4898 | 0001-4842 | 498 |
-| 4 | 5 | 0001-4966 | 0001-4966 | 789 |
-| ... | ... | ... | ... | ... |
-| 1755 | 1756 | 2470-0045 | 2470-0045 | 533 |
-| 1756 | 1757 | 2470-0053 | 2470-0045 | 533 |
-| 1757 | 1758 | 2475-9953 | 2475-9953 | 608 |
-| 1758 | 1759 | 2504-4427 | 2504-4427 | 994 |
-| 1759 | 1760 | 2504-4435 | 2504-4427 | 994 |
-
-1760 rows × 4 columns
-
-## Sherpa/Romeo extraction
-
-
-```python
-# extract publisher information from the Sherpa/Romeo records
-for index, row in issn.iterrows():
-    journal_id = row['journal']
-    journal_issn = row['issn']
-    path = 'sherpa/data/' + journal_issn + '.json'
-    # check that a record was downloaded for this ISSN
-    if os.path.exists(path):
-        with open(path, 'r', encoding='utf-8') as f:
-            data = json.load(f)
-        if len(data['items']) > 0:
-            # only the first publisher of the first item is kept
-            link = data['items'][0]['publishers'][0]
-            publisher = link['publisher']
-            new_row = {'journal': journal_id,
-                       'publisher_id': publisher['id'],
-                       'name': publisher['name'][0].get('name', ''),
-                       'country': publisher.get('country', ''),
-                       'type': link.get('relationship_type', ''),
-                       'url': publisher.get('url', '')}
-            # DataFrame.append was removed in pandas 2.0, so use pd.concat
-            publisher_sherpa = pd.concat([publisher_sherpa, pd.DataFrame([new_row])], ignore_index=True)
-            sherpa_match = 'OK'
-        else:
-            # 'trouvé mais vide' = found but empty
-            print(journal_issn + ' - trouvé mais vide')
-            sherpa_match = 'empty'
-    else:
-        # 'pas trouvé' = not found
-        print(journal_issn + ' - pas trouvé')
-        sherpa_match = 'missing'
-    # record the match status of every ISSN, found or not
-    match_row = {'issn': journal_issn, 'sherpa_match': sherpa_match}
-    sherpa_match_issn = pd.concat([sherpa_match_issn, pd.DataFrame([match_row])], ignore_index=True)
-```
-
- 1399-0039 - pas trouvé
- 1520-8524 - trouvé mais vide
- 1520-9024 - pas trouvé
- 1468-2834 - pas trouvé
- 1551-2916 - pas trouvé
- 1943-2984 - pas trouvé
- 1555-7162 - trouvé mais vide
- 2163-5773 - pas trouvé
- 1873-4324 - trouvé mais vide
- 1526-7598 - pas trouvé
- 1673-3134 - pas trouvé
- 1777-5884 - pas trouvé
- 1528-1140 - pas trouvé
- 1468-2060 - pas trouvé
- 1552-6259 - pas trouvé
- 0003-6935 - trouvé mais vide
- 1520-8842 - pas trouvé
- 0003-9926 - trouvé mais vide
- 1538-3679 - pas trouvé
- 0003-9942 - trouvé mais vide
- 1538-3687 - pas trouvé
- 1529-0131 - pas trouvé
- 1090-2104 - trouvé mais vide
- 1943-295X - pas trouvé
- 1878-2434 - pas trouvé
- 1873-2402 - trouvé mais vide
- 1872-6240 - trouvé mais vide
- 1365-2133 - pas trouvé
- 0007-4403 - trouvé mais vide
- 1968-3766 - pas trouvé
- 0008-042X - trouvé mais vide
- 2104-3329 - pas trouvé
- 2268-7963 - pas trouvé
- 1873-3948 - trouvé mais vide
- 1873-4405 - trouvé mais vide
- 1872-6836 - trouvé mais vide
- 1873-4448 - trouvé mais vide
- 1524-4571 - trouvé mais vide
- 1873-7838 - trouvé mais vide
- 1879-2944 - trouvé mais vide
- 1873-3840 - trouvé mais vide
- 1973-8102 - trouvé mais vide
- 0011-1600 - trouvé mais vide
- 1968-3901 - pas trouvé
- 1879-2235 - trouvé mais vide
- 1095-564X - trouvé mais vide
- 1931-3543 - pas trouvé
- 1385-013X - trouvé mais vide
- 1873-3859 - trouvé mais vide
- 1873-7315 - trouvé mais vide
- 0013-8584 - trouvé mais vide
- 2309-4672 - pas trouvé
- 0014-2239 - trouvé mais vide
- 2272-9011 - pas trouvé
- 0945-5795 - pas trouvé
- 1432-1033 - pas trouvé
- 1365-2362 - pas trouvé
- 1090-2422 - trouvé mais vide
- 1026-7484 - trouvé mais vide
- 1528-0012 - trouvé mais vide
- 1872-9533 - trouvé mais vide
- 0016-9161 - trouvé mais vide
- 2297-7953 - pas trouvé
- 1879-2189 - trouvé mais vide
- 0018-0238 - trouvé mais vide
- 2297-1971 - pas trouvé
- 2334-3303 - pas trouvé
- 1070-6313 - pas trouvé
- 1873-3255 - trouvé mais vide
- 1097-0215 - pas trouvé
- 1879-2146 - trouvé mais vide
- 0021-8170 - trouvé mais vide
- 2114-6292 - pas trouvé
- 1090-266X - trouvé mais vide
- 1520-8850 - trouvé mais vide
- 1879-1484 - trouvé mais vide
- 1067-8832 - pas trouvé
- 1067-8816 - pas trouvé
- 1873-2380 - trouvé mais vide
- 1090-2694 - trouvé mais vide
- 1520-9032 - pas trouvé
- 1873-3778 - trouvé mais vide
- 1945-7197 - pas trouvé
- 0021-9797 - trouvé mais vide
- 1090-2716 - trouvé mais vide
- 1873-5002 - pas trouvé
- 0022-0728 - trouvé mais vide
- 1879-2707 - trouvé mais vide
- 1872-7883 - trouvé mais vide
- 1527-2427 - trouvé mais vide
- 1089-8638 - trouvé mais vide
- 1873-4820 - trouvé mais vide
- 1872-8561 - trouvé mais vide
- 1531-5037 - trouvé mais vide
- 1085-8695 - pas trouvé
- 1097-6833 - pas trouvé
- 1879-2553 - trouvé mais vide
- 1097-6841 - pas trouvé
- 2050-5639 - pas trouvé
- 1873-4782 - trouvé mais vide
- 1878-5883 - trouvé mais vide
- 1085-8687 - pas trouvé
- 1097-685X - pas trouvé
- 1070-6321 - pas trouvé
- 1091-756X - pas trouvé
- 1939-5590 - trouvé mais vide
- 1939-5604 - pas trouvé
- 1873-1856 - trouvé mais vide
- 1872-6143 - pas trouvé
- 0025-6749 - trouvé mais vide
- 1423-0356 - pas trouvé
- 0026-4598 - pas trouvé
- 1432-1874 - pas trouvé
- 0027-4054 - trouvé mais vide
- 1873-3514 - trouvé mais vide
- 1873-0310 - trouvé mais vide
- 1872-616X - pas trouvé
- 1402-4896 - pas trouvé
- 0031-8965 - trouvé mais vide
- 1521-396X - pas trouvé
- 1092-0145 - trouvé mais vide
- 1873-3700 - pas trouvé
- 1532-2548 - pas trouvé
- 1527-2400 - trouvé mais vide
- 0035-1121 - trouvé mais vide
- 1760-7426 - pas trouvé
- 0035-1784 - trouvé mais vide
- 2297-1254 - pas trouvé
- 0035-3655 - trouvé mais vide
- 2104-385X - pas trouvé
- 0036-7486 - trouvé mais vide
- 1424-4004 - trouvé mais vide
- 0036-7672 - trouvé mais vide
- 0036-7699 - trouvé mais vide
- 0036-7893 - trouvé mais vide
- 2504-1452 - pas trouvé
- 1471-1257 - pas trouvé
- 1879-2766 - trouvé mais vide
- 1879-2405 - trouvé mais vide
- 1879-2758 - trouvé mais vide
- 1464-5416 - pas trouvé
- 1873-3581 - pas trouvé
- 1664-2864 - pas trouvé
- 1879-2731 - pas trouvé
- 1534-6080 - trouvé mais vide
- 1873-2623 - pas trouvé
- 1096-0341 - trouvé mais vide
- 1878-5646 - trouvé mais vide
- 1879-2448 - pas trouvé
- 1879-1298 - trouvé mais vide
- 1879-2138 - trouvé mais vide
- 0046-2497 - trouvé mais vide
- 1776-2936 - pas trouvé
- 1873-7625 - trouvé mais vide
- 1879-2472 - pas trouvé
- 2214-8019 - trouvé mais vide
- 0065-7727 - trouvé mais vide
- 1070-6283 - pas trouvé
- 0066-6653 - trouvé mais vide
- 0072-0585 - trouvé mais vide
- 1079-2376 - pas trouvé
- 1557-7988 - trouvé mais vide
- 0081-1254 - trouvé mais vide
- 1523-1755 - pas trouvé
- 1085-8725 - pas trouvé
- 1097-6825 - trouvé mais vide
- 1096-0260 - pas trouvé
- 1522-8541 - pas trouvé
- 1551-7616 - pas trouvé
- 1935-0465 - pas trouvé
- 1070-633X - pas trouvé
- 1873-4375 - trouvé mais vide
- 1070-6291 - pas trouvé
- 0108-2701 - trouvé mais vide
- 1600-5759 - pas trouvé
- 1879-0097 - pas trouvé
- 1879-2081 - pas trouvé
- 1873-7323 - trouvé mais vide
- 1879-3452 - trouvé mais vide
- 1878-5905 - trouvé mais vide
- 1532-1991 - pas trouvé
- 1071-2763 - pas trouvé
- 1071-8842 - pas trouvé
- 2156-2202 - pas trouvé
- 1081-1281 - pas trouvé
- 1873-7528 - trouvé mais vide
- 1773-0406 - trouvé mais vide
- 0151-0193 - trouvé mais vide
- 2101-0218 - trouvé mais vide
- 0161-7567 - trouvé mais vide
- 2160-9292 - trouvé mais vide
- 1095-3795 - trouvé mais vide
- 1872-678X - trouvé mais vide
- 1573-2517 - pas trouvé
- 1872-7557 - trouvé mais vide
- 1872-7123 - trouvé mais vide
- 1872-7441 - trouvé mais vide
- 1872-7999 - pas trouvé
- 1879-1514 - pas trouvé
- 1874-1754 - trouvé mais vide
- 1872-7697 - trouvé mais vide
- 1873-5568 - trouvé mais vide
- 1872-7352 - pas trouvé
- 1872-9584 - trouvé mais vide
- 1600-0641 - trouvé mais vide
- 1872-9576 - trouvé mais vide
- 1873-5460 - pas trouvé
- 1873-5584 - trouvé mais vide
- 1872-695X - pas trouvé
- 1432-0827 - pas trouvé
- 1432-1262 - pas trouvé
- 0181-5512 - trouvé mais vide
- 1773-0597 - pas trouvé
- 1879-2367 - trouvé mais vide
- 1532-2939 - trouvé mais vide
- 1527-3296 - pas trouvé
- 1558-1497 - trouvé mais vide
- 0221-5918 - trouvé mais vide
- 0248-8663 - trouvé mais vide
- 1768-3122 - trouvé mais vide
- 0252-1881 - trouvé mais vide
- 0252-2969 - trouvé mais vide
- 1661-5468 - pas trouvé
- 0254-945X - trouvé mais vide
- 1662-9760 - pas trouvé
- 0255-9005 - trouvé mais vide
- 0258-6800 - trouvé mais vide
- 1432-0819 - pas trouvé
- 0259-6199 - trouvé mais vide
- 1661-3171 - trouvé mais vide
- 1532-1983 - pas trouvé
- 1873-2518 - trouvé mais vide
- 1365-2346 - pas trouvé
- 1476-5365 - pas trouvé
- 1067-8824 - pas trouvé
- 0271-4302 - trouvé mais vide
- 2158-1525 - pas trouvé
- 1536-4801 - pas trouvé
- 1873-457X - pas trouvé
- 1531-5053 - pas trouvé
- 1470-8752 - pas trouvé
- 1879-176X - pas trouvé
- 1873-4421 - pas trouvé
- 1432-1998 - pas trouvé
- 1873-6246 - pas trouvé
- 1873-6777 - pas trouvé
- 1879-3533 - trouvé mais vide
- 1872-8057 - trouvé mais vide
- 1872-7972 - trouvé mais vide
- 1879-2723 - trouvé mais vide
- 1879-2774 - pas trouvé
- 1873-4766 - trouvé mais vide
- 1362-4954 - pas trouvé
- 1365-2842 - pas trouvé
- 1361-6447 - trouvé mais vide
- 1872-9118 - trouvé mais vide
- 1873-7544 - trouvé mais vide
- 1873-3360 - pas trouvé
- 1873-2100 - pas trouvé
- 1872-9657 - trouvé mais vide
- 1499-2752 - pas trouvé
- 2567-689X - trouvé mais vide
- 1432-1238 - pas trouvé
- 1873-684X - trouvé mais vide
- 1879-355X - trouvé mais vide
- 1879-3487 - trouvé mais vide
- 1873-6785 - trouvé mais vide
- 1546-3141 - pas trouvé
- 0362-1340 - trouvé mais vide
- 1523-2867 - pas trouvé
- 1558-1160 - trouvé mais vide
- 1432-2323 - pas trouvé
- 0365-7116 - trouvé mais vide
- 1873-2526 - pas trouvé
- 0368-4466 - trouvé mais vide
- 1588-2926 - pas trouvé
- 0369-3392 - trouvé mais vide
- 1873-2445 - trouvé mais vide
- 0373-2525 - trouvé mais vide
- 0373-2967 - trouvé mais vide
- 2235-3658 - pas trouvé
- 0373-6156 - trouvé mais vide
- 2391-1336 - pas trouvé
- 0374-4256 - trouvé mais vide
- 0375-1457 - trouvé mais vide
- 2419-8196 - pas trouvé
- 1873-2429 - trouvé mais vide
- 1872-6097 - pas trouvé
- 1872-6860 - trouvé mais vide
- 1574-6968 - pas trouvé
- 1879-0038 - trouvé mais vide
- 1873-3476 - trouvé mais vide
- 1873-2755 - trouvé mais vide
- 1872-6178 - trouvé mais vide
- 1873-2046 - trouvé mais vide
- 1872-6283 - trouvé mais vide
- 0398-3412 - trouvé mais vide
- 2297-5810 - pas trouvé
- 0409-8757 - trouvé mais vide
- 1461-7412 - pas trouvé
- 1873-1562 - trouvé mais vide
- 1089-4918 - trouvé mais vide
- 1538-4500 - pas trouvé
- 0570-0833 - trouvé mais vide
- 0583-8401 - trouvé mais vide
- 1872-7727 - trouvé mais vide
- 1873-264X - trouvé mais vide
- 1527-7755 - pas trouvé
- 1520-8559 - trouvé mais vide
- 1558-3597 - trouvé mais vide
- 1873-5134 - pas trouvé
- 1096-3677 - pas trouvé
- 2213-0276 - pas trouvé
- 1958-5381 - pas trouvé
- 1651-2227 - pas trouvé
- 0884-1616 - trouvé mais vide
- 1091-8876 - pas trouvé
- 1092-8928 - pas trouvé
- 1089-8646 - pas trouvé
- 0888-8809 - trouvé mais vide
- 1944-9917 - trouvé mais vide
- 1532-0987 - pas trouvé
- 0894-8275 - trouvé mais vide
- 1878-5921 - pas trouvé
- 1520-636X - pas trouvé
- 1399-3038 - pas trouvé
- 1873-7196 - trouvé mais vide
- 1873-4308 - trouvé mais vide
- 1573-2509 - trouvé mais vide
- 1879-0658 - trouvé mais vide
- 1873-2135 - pas trouvé
- 1873-2143 - pas trouvé
- 1873-4936 - trouvé mais vide
- 1873-4944 - pas trouvé
- 1872-793X - trouvé mais vide
- 1873-3069 - pas trouvé
- 1872-8286 - trouvé mais vide
- 1873-3077 - pas trouvé
- 1873-4669 - trouvé mais vide
- 1873-3883 - trouvé mais vide
- 0926-9630 - trouvé mais vide
- 1879-8365 - trouvé mais vide
- 1879-3398 - trouvé mais vide
- 1873-4359 - trouvé mais vide
- 1879-0720 - trouvé mais vide
- 1769-664X - pas trouvé
- 1432-2218 - pas trouvé
- 1866-6817 - pas trouvé
- 1432-2277 - pas trouvé
- 1435-4373 - pas trouvé
- 1433-2965 - pas trouvé
- 1873-3441 - pas trouvé
- 1362-3044 - pas trouvé
- 1879-0526 - trouvé mais vide
- 1879-0828 - pas trouvé
- 1879-0410 - trouvé mais vide
- 1873-619X - trouvé mais vide
- 1873-4235 - trouvé mais vide
- 1362-511X - pas trouvé
- 1879-0429 - trouvé mais vide
- 1879-1786 - trouvé mais vide
- 1879-0852 - pas trouvé
- 1879-0682 - pas trouvé
- 1873-2976 - trouvé mais vide
- 1464-3405 - trouvé mais vide
- 1466-1861 - pas trouvé
- 1555-3892 - pas trouvé
- 1360-0443 - pas trouvé
- 1464-3391 - trouvé mais vide
- 1879-2359 - pas trouvé
- 0992-986X - trouvé mais vide
- 2119-4130 - pas trouvé
- 0995-3817 - trouvé mais vide
- 2219-2840 - pas trouvé
- 1010-2248 - trouvé mais vide
- 1664-9885 - pas trouvé
- 1873-2666 - pas trouvé
- 1017-0588 - trouvé mais vide
- 1018-7987 - trouvé mais vide
- 1019-0406 - trouvé mais vide
- 1023-2044 - trouvé mais vide
- 1023-9332 - trouvé mais vide
- 2235-1884 - pas trouvé
- 1560-7917 - pas trouvé
- 1026-7530 - pas trouvé
- 1607-8489 - pas trouvé
- 1127-2236 - pas trouvé
- 1938-808X - pas trouvé
- 1095-8657 - trouvé mais vide
- 1536-3732 - pas trouvé
- 1049-5258 - trouvé mais vide
- 1538-4446 - pas trouvé
- 1095-9572 - trouvé mais vide
- 1532-6500 - trouvé mais vide
- 1059-1524 - trouvé mais vide
- 1095-3787 - trouvé mais vide
- 1538-4519 - trouvé mais vide
- 1063-6919 - trouvé mais vide
- 2332-564X - pas trouvé
- 2575-7075 - pas trouvé
- 1940-6029 - trouvé mais vide
- 1527-2435 - pas trouvé
- 1527-2419 - pas trouvé
- 1071-1023 - trouvé mais vide
- 1520-8567 - pas trouvé
- 1090-235X - trouvé mais vide
- 1532-2130 - pas trouvé
- 1096-0856 - trouvé mais vide
- 1538-4489 - pas trouvé
- 1155-4339 - trouvé mais vide
- 1764-7177 - pas trouvé
- 1460-9592 - pas trouvé
- 1878-3511 - pas trouvé
- 1778-7254 - pas trouvé
- 1873-4030 - pas trouvé
- 1873-2844 - trouvé mais vide
- 1873-5126 - trouvé mais vide
- 1873-5606 - pas trouvé
- 1873-2453 - trouvé mais vide
- 1872-8456 - pas trouvé
- 2040-2058 - pas trouvé
- 1878-5840 - trouvé mais vide
- 1473-6519 - pas trouvé
- 1879-0690 - trouvé mais vide
- 1466-609X - pas trouvé
- 1367-4811 - trouvé mais vide
- 1873-4286 - pas trouvé
- 1873-3212 - trouvé mais vide
- 1873-1759 - pas trouvé
- 1875-8908 - trouvé mais vide
- 1872-8952 - trouvé mais vide
- 1873-1902 - trouvé mais vide
- 1600-0854 - pas trouvé
- 1420-5556 - trouvé mais vide
- 1420-7192 - trouvé mais vide
- 1662-0879 - pas trouvé
- 1422-2019 - trouvé mais vide
- 1422-3449 - trouvé mais vide
- 1422-5778 - trouvé mais vide
- 2504-1436 - pas trouvé
- 1423-3967 - trouvé mais vide
- 1663-3997 - pas trouvé
- 1424-1811 - trouvé mais vide
- 2504-1460 - pas trouvé
- 1424-4020 - pas trouvé
- 1424-7410 - trouvé mais vide
- 1424-7755 - trouvé mais vide
- 1436-3771 - pas trouvé
- 1434-6028 - trouvé mais vide
- 1434-6036 - trouvé mais vide
- 1439-4456 - pas trouvé
- 1449-8979 - pas trouvé
- 1873-6416 - trouvé mais vide
- 1465-6914 - trouvé mais vide
- 1478-6362 - pas trouvé
- 1520-6149 - trouvé mais vide
- 2379-190X - trouvé mais vide
- 1522-1601 - pas trouvé
- 1708-8208 - pas trouvé
- 1944-7884 - pas trouvé
- 1527-6473 - pas trouvé
- 1947-3893 - pas trouvé
- 1530-1591 - trouvé mais vide
- 1558-1101 - pas trouvé
- 1860-2002 - pas trouvé
- 1552-5279 - pas trouvé
- 1557-170X - trouvé mais vide
- 1878-5530 - trouvé mais vide
- 1878-1519 - trouvé mais vide
- 1569-9293 - pas trouvé
- 1873-376X - pas trouvé
- 1720-8319 - pas trouvé
- 1610-0379 - trouvé mais vide
- 1610-0387 - pas trouvé
- 1778-3569 - trouvé mais vide
- 1660-3362 - trouvé mais vide
- 1660-9379 - trouvé mais vide
- 1660-9603 - trouvé mais vide
- 1661-1179 - trouvé mais vide
- 1661-2620 - trouvé mais vide
- 1661-464X - trouvé mais vide
- 1661-4941 - trouvé mais vide
- 1661-8165 - pas trouvé
- 1662-551X - pas trouvé
- 1662-5536 - trouvé mais vide
- 1662-6001 - trouvé mais vide
- 1662-601X - pas trouvé
- 1662-8705 - trouvé mais vide
- 1777-5477 - trouvé mais vide
- 1810-7621 - pas trouvé
- 1863-2300 - pas trouvé
- 1873-2763 - trouvé mais vide
- 1876-7737 - pas trouvé
- 1878-8769 - trouvé mais vide
- 1939-5175 - trouvé mais vide
- 1945-7928 - trouvé mais vide
- 1945-7936 - pas trouvé
- 1945-8452 - trouvé mais vide
- 1992-2655 - trouvé mais vide
- 2050-7534 - trouvé mais vide
- 2101-6275 - pas trouvé
- 2161-2129 - pas trouvé
- 2160-5033 - trouvé mais vide
- 2160-5041 - pas trouvé
- 2160-9020 - trouvé mais vide
- 2160-9047 - pas trouvé
- 2164-3342 - trouvé mais vide
- 2174-8454 - trouvé mais vide
- 2340-115X - pas trouvé
- 2211-3282 - trouvé mais vide
- 2264-7228 - trouvé mais vide
- 2297-0703 - trouvé mais vide
- 2297-6981 - trouvé mais vide
- 2297-7007 - pas trouvé
- 2352-1791 - trouvé mais vide
- 2504-4427 - trouvé mais vide
- 2504-4435 - trouvé mais vide
-
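The three statuses written to `sherpa_match` by the loop above can be condensed into one helper. This is a minimal sketch, assuming the downloaded Sherpa/Romeo files carry an `items` list as in the notebook; the name `classify_sherpa_file` and the placeholder ISSNs are hypothetical, and unlike the notebook the sketch treats a file without an `items` key as empty instead of raising.

```python
import json
import os
import tempfile


def classify_sherpa_file(path):
    """Return 'missing', 'empty' or 'OK' for one downloaded Sherpa/Romeo file."""
    if not os.path.exists(path):
        return 'missing'  # no file for this ISSN ("pas trouvé")
    with open(path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    if len(data.get('items', [])) == 0:
        return 'empty'    # file exists but holds no record ("trouvé mais vide")
    return 'OK'


# Demonstration on throw-away files; the ISSNs below are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, '0000-0001.json'), 'w', encoding='utf-8') as f:
        json.dump({'items': [{'title': [{'title': 'Example'}]}]}, f)
    with open(os.path.join(tmp, '0000-0002.json'), 'w', encoding='utf-8') as f:
        json.dump({'items': []}, f)
    statuses = {
        issn: classify_sherpa_file(os.path.join(tmp, issn + '.json'))
        for issn in ['0000-0001', '0000-0002', '0000-0003']
    }
print(statuses)
```

Separating the classification from the extraction of publisher fields keeps each step testable on its own.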
-
-
-```python
-publisher_sherpa
-```
-
-|      | journal | publisher_id | name | country | type | url |
-|------|---------|--------------|------|---------|------|-----|
-| 0    | 532 | 45   | John Wiley and Sons | gb | former_publisher | http://www.wiley.com/ |
-| 1    | 498 | 4    | American Chemical Society | us | society_publisher | http://pubs.acs.org/ |
-| 2    | 498 | 4    | American Chemical Society | us | society_publisher | http://pubs.acs.org/ |
-| 3    | 789 | 126  | Acoustical Society of America | us | society_publisher | http://acousticalsociety.org/ |
-| 4    | 166 | 3291 | Springer | gb | commercial_publisher | https://www.springernature.com/gp/products/jou... |
-| ...  | ... | ...  | ... | ... | ... | ... |
-| 1238 | 80  | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1239 | 80  | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1240 | 533 | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1241 | 533 | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1242 | 608 | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-
-1243 rows × 6 columns
-
-
-
-
-
-```python
-sherpa_match_issn
-```
-
-|      | issn | sherpa_match |
-|------|------|--------------|
-| 0    | 0001-2815 | OK |
-| 1    | 1399-0039 | missing |
-| 2    | 0001-4842 | OK |
-| 3    | 1520-4898 | OK |
-| 4    | 0001-4966 | OK |
-| ...  | ... | ... |
-| 1755 | 2470-0045 | OK |
-| 1756 | 2470-0053 | OK |
-| 1757 | 2475-9953 | OK |
-| 1758 | 2504-4427 | empty |
-| 1759 | 2504-4435 | empty |
-
-1760 rows × 2 columns
-
-
-
-
-
-```python
-# dedup
-publisher_sherpa_dedup = publisher_sherpa.drop_duplicates()
-publisher_sherpa_dedup
-```
-
-|      | journal | publisher_id | name | country | type | url |
-|------|---------|--------------|------|---------|------|-----|
-| 0    | 532 | 45   | John Wiley and Sons | gb | former_publisher | http://www.wiley.com/ |
-| 1    | 498 | 4    | American Chemical Society | us | society_publisher | http://pubs.acs.org/ |
-| 3    | 789 | 126  | Acoustical Society of America | us | society_publisher | http://acousticalsociety.org/ |
-| 4    | 166 | 3291 | Springer | gb | commercial_publisher | https://www.springernature.com/gp/products/jou... |
-| 6    | 807 | 3291 | Springer | gb | commercial_publisher | https://www.springernature.com/gp/products/jou... |
-| ...  | ... | ...  | ... | ... | ... | ... |
-| 1235 | 870 | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1236 | 41  | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1238 | 80  | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1240 | 533 | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 1242 | 608 | 10   | American Physical Society | us | society_publisher | http://www.aps.org/ |
-
-808 rows × 6 columns
-
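The deduplicated output above keeps the original row labels (0, 1, 3, 4, 6, ...), because `drop_duplicates` removes rows without renumbering them. A minimal sketch on a toy frame, assuming a contiguous index is wanted afterwards; the frame `demo` is illustrative and only reuses a few values from the output.

```python
import pandas as pd

# Toy stand-in for publisher_sherpa: rows 1 and 2 are identical.
demo = pd.DataFrame({
    'journal': [532, 498, 498, 789],
    'publisher_id': [45, 4, 4, 126],
})

dedup = demo.drop_duplicates()
print(dedup.index.tolist())       # the gap left by the dropped row stays visible

# Renumber the rows from 0 when the old labels are no longer needed.
renumbered = dedup.reset_index(drop=True)
print(renumbered.index.tolist())
```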
-
-
-
-
-```python
-# add the ISSN-L and the title
-sherpa_match_issn = pd.merge(sherpa_match_issn, issn_ids, on='issn', how='left')
-sherpa_match_issn = pd.merge(sherpa_match_issn, journal[['issnl', 'title']], on='issnl', how='left')
-sherpa_match_issn
-```
-
-|      | issn | sherpa_match | id | issnl | journal | title |
-|------|------|--------------|----|-------|---------|-------|
-| 0    | 0001-2815 | OK      | 1    | 0001-2815 | 532 | Tissue antigens |
-| 1    | 1399-0039 | missing | 2    | 0001-2815 | 532 | Tissue antigens |
-| 2    | 0001-4842 | OK      | 3    | 0001-4842 | 498 | Accounts of chemical research |
-| 3    | 1520-4898 | OK      | 4    | 0001-4842 | 498 | Accounts of chemical research |
-| 4    | 0001-4966 | OK      | 5    | 0001-4966 | 789 | The Journal of the Acoustical Society of America |
-| ...  | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | OK      | 1756 | 2470-0045 | 533 | Physical review. E (Print) |
-| 1756 | 2470-0053 | OK      | 1757 | 2470-0045 | 533 | Physical review. E (Print) |
-| 1757 | 2475-9953 | OK      | 1758 | 2475-9953 | 608 | Physical review materials |
-| 1758 | 2504-4427 | empty   | 1759 | 2504-4427 | 994 | GG@G (Print) |
-| 1759 | 2504-4435 | empty   | 1760 | 2504-4427 | 994 | GG@G (Print) |
-
-1760 rows × 6 columns
-
-
-
-
-
-```python
-sherpa_match_results = sherpa_match_issn[['id', 'issnl', 'sherpa_match']].groupby(['issnl', 'sherpa_match']).count()
-sherpa_match_results
-```
-
-| issnl | sherpa_match | id |
-|-------|--------------|----|
-| 0001-2815 | OK      | 1 |
-|           | missing | 1 |
-| 0001-4842 | OK      | 2 |
-| 0001-4966 | OK      | 1 |
-|           | empty   | 1 |
-| ...       | ...     | ... |
-| 2469-9950 | OK      | 2 |
-| 2470-0010 | OK      | 2 |
-| 2470-0045 | OK      | 2 |
-| 2475-9953 | OK      | 1 |
-| 2504-4427 | empty   | 2 |
-
-1302 rows × 1 columns
-
-
-
-
-
-```python
-sherpa_match_results = sherpa_match_results.reset_index()
-sherpa_match_results
-```
-
-|      | issnl | sherpa_match | id |
-|------|-------|--------------|----|
-| 0    | 0001-2815 | OK      | 1 |
-| 1    | 0001-2815 | missing | 1 |
-| 2    | 0001-4842 | OK      | 2 |
-| 3    | 0001-4966 | OK      | 1 |
-| 4    | 0001-4966 | empty   | 1 |
-| ...  | ... | ... | ... |
-| 1297 | 2469-9950 | OK      | 2 |
-| 1298 | 2470-0010 | OK      | 2 |
-| 1299 | 2470-0045 | OK      | 2 |
-| 1300 | 2475-9953 | OK      | 1 |
-| 1301 | 2504-4427 | empty   | 2 |
-
-1302 rows × 3 columns
-
-
-
-
-
-```python
-sherpa_match_results_ok = sherpa_match_results.loc[sherpa_match_results['sherpa_match'] == 'OK']
-issn_ids_issnl = issn_ids[['issnl', 'journal']].drop_duplicates(subset='issnl')
-issn_ids_issnl = pd.merge(issn_ids_issnl, sherpa_match_results_ok, on='issnl', how='left')
-issn_ids_issnl = pd.merge(issn_ids_issnl, journal[['issnl', 'title']], on='issnl', how='left')
-issn_ids_issnl
-```
-
-|     | issnl | journal | sherpa_match | id | title |
-|-----|-------|---------|--------------|----|-------|
-| 0   | 0001-2815 | 532 | OK  | 1.0 | Tissue antigens |
-| 1   | 0001-4842 | 498 | OK  | 2.0 | Accounts of chemical research |
-| 2   | 0001-4966 | 789 | OK  | 1.0 | The Journal of the Acoustical Society of America |
-| 3   | 0001-6268 | 166 | OK  | 2.0 | Acta neurochirurgica |
-| 4   | 0001-6322 | 807 | OK  | 2.0 | Acta neuropathologica |
-| ... | ... | ... | ... | ... | ... |
-| 904 | 2469-9950 | 41  | OK  | 2.0 | Physical review. B |
-| 905 | 2470-0010 | 80  | OK  | 2.0 | Physical review. D |
-| 906 | 2470-0045 | 533 | OK  | 2.0 | Physical review. E (Print) |
-| 907 | 2475-9953 | 608 | OK  | 1.0 | Physical review materials |
-| 908 | 2504-4427 | 994 | NaN | NaN | GG@G (Print) |
-
-909 rows × 5 columns
-
-
-
-
-
-```python
-journals_not_sherpa = issn_ids_issnl.loc[issn_ids_issnl['sherpa_match'].isna()]
-journals_not_sherpa
-```
-
-|     | issnl | journal | sherpa_match | id | title |
-|-----|-------|---------|--------------|----|-------|
-| 24  | 0003-6935 | 398 | NaN | NaN | Applied optics |
-| 27  | 0003-9926 | 605 | NaN | NaN | Archives of internal medicine (1960) |
-| 28  | 0003-9942 | 974 | NaN | NaN | Archives of neurology (Chicago) |
-| 47  | 0007-4403 | 885 | NaN | NaN | Bulletin de psychologie |
-| 48  | 0008-042X | 180 | NaN | NaN | Cahiers pédagogiques (Revue) |
-| ... | ... | ... | ... | ... | ... |
-| 889 | 2264-7228 | 503 | NaN | NaN | Distances et médiations des savoirs |
-| 892 | 2297-0703 | 989 | NaN | NaN | Schweizer Krebs-Bulletin |
-| 893 | 2297-6981 | 618 | NaN | NaN | Swiss archives of neurology, psychiatry and ps... |
-| 898 | 2352-1791 | 639 | NaN | NaN | Nuclear materials and energy |
-| 908 | 2504-4427 | 994 | NaN | NaN | GG@G (Print) |
-
-101 rows × 5 columns
-
-
-
-
-
-```python
-# ISSN-Ls with at least one 'empty' or 'missing' lookup
-sherpa_match_results_empty = sherpa_match_results.loc[sherpa_match_results['sherpa_match'] == 'empty']
-sherpa_match_results_missing = sherpa_match_results.loc[sherpa_match_results['sherpa_match'] == 'missing']
-# drop() returns a new frame, which avoids modifying a slice in place
-journals_not_sherpa = journals_not_sherpa.drop(columns=['sherpa_match', 'id'])
-# the two merges yield sherpa_match_x ('empty') and sherpa_match_y ('missing')
-journals_not_sherpa = pd.merge(journals_not_sherpa, sherpa_match_results_empty.drop(columns='id'), on='issnl', how='left')
-journals_not_sherpa = pd.merge(journals_not_sherpa, sherpa_match_results_missing.drop(columns='id'), on='issnl', how='left')
-journals_not_sherpa
-```
-
-|     | issnl | journal | title | sherpa_match_x | sherpa_match_y |
-|-----|-------|---------|-------|----------------|----------------|
-| 0   | 0003-6935 | 398 | Applied optics | empty | NaN |
-| 1   | 0003-9926 | 605 | Archives of internal medicine (1960) | empty | missing |
-| 2   | 0003-9942 | 974 | Archives of neurology (Chicago) | empty | missing |
-| 3   | 0007-4403 | 885 | Bulletin de psychologie | empty | missing |
-| 4   | 0008-042X | 180 | Cahiers pédagogiques (Revue) | empty | missing |
-| ... | ... | ... | ... | ... | ... |
-| 96  | 2264-7228 | 503 | Distances et médiations des savoirs | empty | NaN |
-| 97  | 2297-0703 | 989 | Schweizer Krebs-Bulletin | empty | NaN |
-| 98  | 2297-6981 | 618 | Swiss archives of neurology, psychiatry and ps... | empty | missing |
-| 99  | 2352-1791 | 639 | Nuclear materials and energy | empty | NaN |
-| 100 | 2504-4427 | 994 | GG@G (Print) | empty | NaN |
-
-101 rows × 5 columns
-
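The per-status counts built above with `groupby`, `reset_index` and several merges can also be laid out as one wide table. This is a hedged sketch using `pd.crosstab`, not the notebook's own method; the frame `matches` is a small illustrative stand-in for `sherpa_match_issn` that reuses three ISSN-Ls from the outputs.

```python
import pandas as pd

# A few rows shaped like sherpa_match_issn (one row per looked-up ISSN).
matches = pd.DataFrame({
    'issnl': ['0001-2815', '0001-2815', '0003-6935', '0003-9926', '0003-9926'],
    'sherpa_match': ['OK', 'missing', 'empty', 'empty', 'missing'],
})

# One row per ISSN-L, one column per status, each cell counting ISSNs.
status_counts = pd.crosstab(matches['issnl'], matches['sherpa_match'])
# Make sure all three status columns exist even if one never occurs.
status_counts = status_counts.reindex(columns=['OK', 'empty', 'missing'], fill_value=0)

# Journals for which no ISSN returned a usable Sherpa/Romeo record.
not_in_sherpa = status_counts[status_counts['OK'] == 0]
print(not_in_sherpa)
```

The wide layout shows in one row whether a journal failed because its files were empty, missing, or both.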
-
-
-
-
-```python
-# extract journal information (title, URL) from the Sherpa/Romeo data
-for index, row in issn.iterrows():
-    journal_id = row['journal']
-    journal_issn = row['issn']
-    # print a progress marker every 10 rows
-    if index % 10 == 0:
-        print(index)
-    # loop over the JSON files, skipping ISSNs for which no file exists
-    if os.path.exists('sherpa/data/' + journal_issn + '.json'):
-        with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
-            data = json.load(f)
-            title = np.nan
-            url = np.nan
-            if len(data['items']) > 0:
-                if 'url' in data['items'][0]:
-                    url = data['items'][0]['url']
-                if 'title' in data['items'][0]['title'][0]:
-                    title = data['items'][0]['title'][0]['title']
-                # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
-                sherpa_journal = pd.concat([sherpa_journal, pd.DataFrame([
-                    {'journal': journal_id, 'title': title, 'url': url}])], ignore_index=True)
-```
-
- 0
- 10
- 20
- 30
- 40
- 50
- 60
- 70
- 80
- 90
- 100
- 110
- 120
- 130
- 140
- 150
- 160
- 170
- 180
- 190
- 200
- 210
- 220
- 230
- 240
- 250
- 260
- 270
- 280
- 290
- 300
- 310
- 320
- 330
- 340
- 350
- 360
- 370
- 380
- 390
- 400
- 410
- 420
- 430
- 440
- 450
- 460
- 470
- 480
- 490
- 500
- 510
- 520
- 530
- 540
- 550
- 560
- 570
- 580
- 590
- 600
- 610
- 620
- 630
- 640
- 650
- 660
- 670
- 680
- 690
- 700
- 710
- 720
- 730
- 740
- 750
- 760
- 770
- 780
- 790
- 800
- 810
- 820
- 830
- 840
- 850
- 860
- 870
- 880
- 890
- 900
- 910
- 920
- 930
- 940
- 950
- 960
- 970
- 980
- 990
- 1000
- 1010
- 1020
- 1030
- 1040
- 1050
- 1060
- 1070
- 1080
- 1090
- 1100
- 1110
- 1120
- 1130
- 1140
- 1150
- 1160
- 1170
- 1180
- 1190
- 1200
- 1210
- 1220
- 1230
- 1240
- 1250
- 1260
- 1270
- 1280
- 1290
- 1300
- 1310
- 1320
- 1330
- 1340
- 1350
- 1360
- 1370
- 1380
- 1390
- 1400
- 1410
- 1420
- 1430
- 1440
- 1450
- 1460
- 1470
- 1480
- 1490
- 1500
- 1510
- 1520
- 1530
- 1540
- 1550
- 1560
- 1570
- 1580
- 1590
- 1600
- 1610
- 1620
- 1630
- 1640
- 1650
- 1660
- 1670
- 1680
- 1690
- 1700
- 1710
- 1720
- 1730
- 1740
- 1750
-
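The loop above grows `sherpa_journal` one row at a time, which copies the frame on every iteration. A minimal sketch of the usual alternative, collecting plain dictionaries and building the frame once; the `parsed` dictionary, the example.org URL and the name `sherpa_journal_sketch` are illustrative assumptions, not notebook data.

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins for already parsed JSON files, keyed by journal id.
parsed = {
    532: {'items': [{'url': 'http://example.org/a', 'title': [{'title': 'Journal A'}]}]},
    498: {'items': [{'title': [{'title': 'Journal B'}]}]},
    994: {'items': []},
}

rows = []
for journal_id, data in parsed.items():
    if len(data['items']) == 0:
        continue  # nothing to extract from an empty record
    item = data['items'][0]
    rows.append({
        'journal': journal_id,
        'title': item['title'][0].get('title', np.nan),
        'url': item.get('url', np.nan),
    })

# Build the frame once instead of extending it row by row.
sherpa_journal_sketch = pd.DataFrame(rows, columns=['journal', 'title', 'url'])
print(sherpa_journal_sketch)
```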
-
-
-```python
-sherpa_journal
-```
-
-|      | journal | title | url |
-|------|---------|-------|-----|
-| 0    | 532 | Tissue Antigens | http://onlinelibrary.wiley.com/journal/10.1111... |
-| 1    | 498 | Accounts of Chemical Research | http://pubs.acs.org/journal/achre4 |
-| 2    | 498 | Accounts of Chemical Research | http://pubs.acs.org/journal/achre4 |
-| 3    | 789 | The Journal of the Acoustical Society of America | http://asa.scitation.org/journal/jas |
-| 4    | 166 | Acta Neurochirurgica | http://link.springer.com/journal/701 |
-| ...  | ... | ... | ... |
-| 1238 | 80  | Physical Review D | http://prd.aps.org/ |
-| 1239 | 80  | Physical Review D | http://prd.aps.org/ |
-| 1240 | 533 | Physical Review E | http://journals.aps.org/pre/abstract/10.1103/P... |
-| 1241 | 533 | Physical Review E | http://journals.aps.org/pre/abstract/10.1103/P... |
-| 1242 | 608 | Physical Review Materials | http://journals.aps.org/prmaterials/ |
-
-1243 rows × 3 columns
-
-
-
-
-
-```python
-# extract the ISSNs and their types from the Sherpa/Romeo data
-for index, row in issn.iterrows():
-    journal_id = row['journal']
-    journal_issn = row['issn']
-    # print a progress marker every 10 rows
-    if index % 10 == 0:
-        print(index)
-    # loop over the JSON files, skipping ISSNs for which no file exists
-    if os.path.exists('sherpa/data/' + journal_issn + '.json'):
-        with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
-            myissn = np.nan
-            mytype = np.nan
-            data = json.load(f)
-            if len(data['items']) > 0:
-                if 'issns' in data['items'][0]:
-                    issns = data['items'][0]['issns']
-                    # one output row per ISSN listed in the record
-                    for i in issns:
-                        if 'issn' in i:
-                            myissn = i['issn']
-                        if 'type' in i:
-                            mytype = i['type']
-                        # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
-                        sherpa_issn = pd.concat([sherpa_issn, pd.DataFrame([
-                            {'issn': myissn, 'type': mytype}])], ignore_index=True)
-```
-
- 0
- 10
- 20
- 30
- 40
- 50
- 60
- 70
- 80
- 90
- 100
- 110
- 120
- 130
- 140
- 150
- 160
- 170
- 180
- 190
- 200
- 210
- 220
- 230
- 240
- 250
- 260
- 270
- 280
- 290
- 300
- 310
- 320
- 330
- 340
- 350
- 360
- 370
- 380
- 390
- 400
- 410
- 420
- 430
- 440
- 450
- 460
- 470
- 480
- 490
- 500
- 510
- 520
- 530
- 540
- 550
- 560
- 570
- 580
- 590
- 600
- 610
- 620
- 630
- 640
- 650
- 660
- 670
- 680
- 690
- 700
- 710
- 720
- 730
- 740
- 750
- 760
- 770
- 780
- 790
- 800
- 810
- 820
- 830
- 840
- 850
- 860
- 870
- 880
- 890
- 900
- 910
- 920
- 930
- 940
- 950
- 960
- 970
- 980
- 990
- 1000
- 1010
- 1020
- 1030
- 1040
- 1050
- 1060
- 1070
- 1080
- 1090
- 1100
- 1110
- 1120
- 1130
- 1140
- 1150
- 1160
- 1170
- 1180
- 1190
- 1200
- 1210
- 1220
- 1230
- 1240
- 1250
- 1260
- 1270
- 1280
- 1290
- 1300
- 1310
- 1320
- 1330
- 1340
- 1350
- 1360
- 1370
- 1380
- 1390
- 1400
- 1410
- 1420
- 1430
- 1440
- 1450
- 1460
- 1470
- 1480
- 1490
- 1500
- 1510
- 1520
- 1530
- 1540
- 1550
- 1560
- 1570
- 1580
- 1590
- 1600
- 1610
- 1620
- 1630
- 1640
- 1650
- 1660
- 1670
- 1680
- 1690
- 1700
- 1710
- 1720
- 1730
- 1740
- 1750
-
-
-
-```python
-sherpa_issn
-```
-
-|      | issn | type |
-|------|------|------|
-| 0    | 0001-2815 | print |
-| 1    | 1399-0039 | electronic |
-| 2    | 0001-4842 | print |
-| 3    | 1520-4898 | electronic |
-| 4    | 0001-4842 | print |
-| ...  | ... | ... |
-| 2196 | 2470-0045 | print |
-| 2197 | 2470-0053 | electronic |
-| 2198 | 2470-0045 | print |
-| 2199 | 2470-0053 | electronic |
-| 2200 | 2475-9953 | electronic |
-
-2201 rows × 2 columns
-
-
-
-
-
-```python
-# dedup
-sherpa_issn = sherpa_issn.drop_duplicates()
-sherpa_issn
-```
-
-|      | issn | type |
-|------|------|------|
-| 0    | 0001-2815 | print |
-| 1    | 1399-0039 | electronic |
-| 2    | 0001-4842 | print |
-| 3    | 1520-4898 | electronic |
-| 6    | 0001-4966 | print |
-| ...  | ... | ... |
-| 2192 | 2470-0010 | print |
-| 2193 | 2470-0029 | electronic |
-| 2196 | 2470-0045 | print |
-| 2197 | 2470-0053 | electronic |
-| 2200 | 2475-9953 | electronic |
-
-1333 rows × 2 columns
-
-
-
-
-
-```python
-# complete the ISSN table with the ISSN types from Sherpa
-issn2 = pd.merge(issn, sherpa_issn, on='issn', how='left')
-issn2
-```
-
-|      | issn | issnl | journal | format | issn_type | id | type |
-|------|------|-------|---------|--------|-----------|----|------|
-| 0    | 0001-2815 | 0001-2815 | 532 | PRINT      | 1 | 1    | print |
-| 1    | 1399-0039 | 0001-2815 | 532 | NaN        | 3 | 2    | electronic |
-| 2    | 0001-4842 | 0001-4842 | 498 | PRINT      | 1 | 3    | print |
-| 3    | 1520-4898 | 0001-4842 | 498 | NaN        | 3 | 4    | electronic |
-| 4    | 0001-4966 | 0001-4966 | 789 | PRINT      | 1 | 5    | print |
-| ...  | ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER      | 3 | 1756 | print |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN        | 3 | 1757 | electronic |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | electronic |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT      | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN        | 3 | 1760 | NaN |
-
-1760 rows × 7 columns
-
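In the merged table above, ISSNs unknown to Sherpa/Romeo show up only as `NaN` in the `type` column. A minimal sketch, assuming one wants those ISSNs listed explicitly: the `indicator=True` option of `pd.merge` records where each row came from. The two small frames are illustrative and reuse three ISSNs from the output.

```python
import pandas as pd

# Toy stand-ins for the issn and sherpa_issn frames.
issn_demo = pd.DataFrame({'issn': ['0001-2815', '1399-0039', '2504-4427']})
sherpa_issn_demo = pd.DataFrame({
    'issn': ['0001-2815', '1399-0039'],
    'type': ['print', 'electronic'],
})

# indicator=True adds a '_merge' column: 'both' or 'left_only' for a left join.
merged = pd.merge(issn_demo, sherpa_issn_demo, on='issn', how='left', indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'issn'].tolist()
print(unmatched)
```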
-
-
-
-
-```python
-# export to tab-separated files (TSV)
-publisher_sherpa_dedup.to_csv('sample/publisher_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-sherpa_match_issn.to_csv('sample/sherpa_match_issn.tsv', sep='\t', encoding='utf-8', index=False)
-sherpa_journal.to_csv('sample/sherpa_journal.tsv', sep='\t', encoding='utf-8', index=False)
-issn2.to_csv('sample/issn_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-journals_not_sherpa.to_csv('sample/journals_not_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export to Excel
-publisher_sherpa_dedup.to_excel('sample/publisher_sherpa.xlsx', index=False)
-sherpa_match_issn.to_excel('sample/sherpa_match_issn.xlsx', index=False)
-sherpa_journal.to_excel('sample/sherpa_journal.xlsx', index=False)
-issn2.to_excel('sample/issn_sherpa.xlsx', index=False)
-journals_not_sherpa.to_excel('sample/journals_not_sherpa.xlsx', index=False)
-```
-
-
-```python
-# add the Sherpa titles to the journal table
-# rename the columns first so that both tables share the 'id' key
-sherpa_journal = sherpa_journal.rename(columns={'journal' : 'id'})
-journal = pd.merge(journal, sherpa_journal, on='id', how='left')
-journal
-```
-
-| | id | issn | issnl | title_x | starting_year | end_year | url_x | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss | portico_status | portico | nlch_title | nlch | qoam_av_score | doublon_issnl | oa_status | publisher | title_y | url_y |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 1 | NaN | NaN |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 | 2 | Physical Review Letters | http://prl.aps.org/ |
-| 2 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 | 2 | Physical Review Letters | http://prl.aps.org/ |
-| 3 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1.0 | Yes | 1.0 | PLoS One | 1.0 | NaN | 0.0 | NaN | 0.0 | 4.035714 | NaN | 5 | 3 | PLoS ONE | http://www.plosone.org/ |
-| 4 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 4, 5 | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1341 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 75 | Journal of Pediatric Surgery | http://www.jpedsurg.org/ |
-| 1342 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 | 8 | Probability Theory and Related Fields | http://www.springerlink.com/content/100451/?p=... |
-| 1343 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 | 8 | Probability Theory and Related Fields | http://www.springerlink.com/content/100451/?p=... |
-| 1344 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 119 | Renewable Energy | http://www.elsevier.com/wps/product/cws_home/9... |
-| 1345 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 217 | NaN | NaN |
-
-1346 rows × 26 columns
-
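The merged frame above repeats some journals (for example `id` 2 at index 1 and 2). One possible cause, assumed here rather than confirmed by the notebook, is that the right-hand table holds several rows per `id`: a left merge emits one output row per matching pair. A minimal sketch with made-up frames (`left`, `right` and their values are illustrative only):

```python
import pandas as pd

# Hypothetical stand-ins for the journal and Sherpa tables.
left = pd.DataFrame({"id": [1, 2], "title": ["A", "B"]})
right = pd.DataFrame({"id": [2, 2], "sherpa_title": ["B (Sherpa)", "B (Sherpa)"]})

# A left merge keeps every left row and repeats it once per matching right row.
merged = pd.merge(left, right, on="id", how="left")

# Deduplicating the right-hand keys first keeps one row per journal.
deduped = pd.merge(left, right.drop_duplicates(subset="id"), on="id", how="left")

print(len(merged), len(deduped))
```

If the right-hand keys are expected to be unique, `pd.merge(..., validate="many_to_one")` raises a `MergeError` instead of silently multiplying rows.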
-```python
-# choose the title and URL (Sherpa value first, local value as fallback)
-journal['url'] = journal['url_y']
-journal.loc[journal['url_y'].isna(), 'url'] = journal['url_x']
-journal['title'] = journal['title_y']
-journal.loc[journal['title_y'].isna(), 'title'] = journal['title_x']
-journal
-```
-
-| | id | issn | issnl | title_x | starting_year | end_year | url_x | name_short_iso_4 | language | country | doaj_title | doaj_seal | APC | doaj_status | lockss_title | lockss | portico_status | portico | nlch_title | nlch | qoam_av_score | doublon_issnl | oa_status | publisher | title_y | url_y | url | title |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1660-9379 | 1660-9379 | Revue médicale suisse | 2005 | 9999 | NaN | Rev. méd. suisse | 138 | 215 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 1 | NaN | NaN | NaN | Revue médicale suisse |
-| 1 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 | 2 | Physical Review Letters | http://prl.aps.org/ | http://prl.aps.org/ | Physical Review Letters |
-| 2 | 2 | 0031-9007 | 0031-9007 | Physical review letters (Print) | 1958 | 9999 | http://prl.aps.org/ | Phys. rev. lett. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | 1.0 | 1 | 2 | Physical Review Letters | http://prl.aps.org/ | http://prl.aps.org/ | Physical Review Letters |
-| 3 | 3 | 1932-6203 | 1932-6203 | PloS one | 2006 | 9999 | http://www.plosone.org/ | NaN | 124 | 236 | PLoS ONE | 1.0 | Yes | 1.0 | PLoS One | 1.0 | NaN | 0.0 | NaN | 0.0 | 4.035714 | NaN | 5 | 3 | PLoS ONE | http://www.plosone.org/ | http://www.plosone.org/ | PLoS ONE |
-| 4 | 4 | 2174-8454 | 2174-8454 | EU-topías | 2011 | 9999 | NaN | EU-topías | 124, 138, 402, 292 | 209 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 4, 5 | NaN | NaN | NaN | EU-topías |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1341 | 998 | 0022-3468 | 0022-3468 | Journal of pediatric surgery (Print) | 1966 | 9999 | http://www.jpedsurg.org | J. pediatr. surg. (Print) | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 75 | Journal of Pediatric Surgery | http://www.jpedsurg.org/ | http://www.jpedsurg.org/ | Journal of Pediatric Surgery |
-| 1342 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 | 8 | Probability Theory and Related Fields | http://www.springerlink.com/content/100451/?p=... | http://www.springerlink.com/content/100451/?p=... | Probability Theory and Related Fields |
-| 1343 | 999 | 1432-2064 | 0178-8051 | Probability theory and related fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451 | Probab. theory relat. fields (Internet) | 124 | 83 | NaN | NaN | NaN | 0.0 | Probability Theory and Related Fields | 1.0 | preserved | 1.0 | Probability Theory and Related Fields | 1.0 | NaN | NaN | 1 | 8 | Probability Theory and Related Fields | http://www.springerlink.com/content/100451/?p=... | http://www.springerlink.com/content/100451/?p=... | Probability Theory and Related Fields |
-| 1344 | 1000 | 0960-1481 | 0960-1481 | Renewable energy | 1991 | 9999 | NaN | Renew. energy | 124 | 234 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | preserved | 1.0 | NaN | 0.0 | NaN | NaN | 1 | 119 | Renewable Energy | http://www.elsevier.com/wps/product/cws_home/9... | http://www.elsevier.com/wps/product/cws_home/9... | Renewable Energy |
-| 1345 | 1001 | 0161-7567 | 0161-7567 | Journal of applied physiology: respiratory, en... | 1977 | 1984 | https://www.physiology.org/journal/jappl | J. appl. physiol.: respir., environ. exercise ... | 124 | 236 | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | 0.0 | NaN | NaN | 1 | 217 | NaN | NaN | https://www.physiology.org/journal/jappl | Journal of applied physiology: respiratory, en... |
-
-1346 rows × 28 columns
-
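The title and URL selection above is a coalescing step: take the Sherpa value when present, otherwise fall back to the local value. A minimal sketch of the same idea using `fillna` with a second Series (the `demo` frame and its two rows are illustrative, not taken from the real tables):

```python
import pandas as pd

# Hypothetical two-row frame mimicking the merged title columns.
demo = pd.DataFrame({
    "title_x": ["Revue médicale suisse", "Physical review letters (Print)"],
    "title_y": [None, "Physical Review Letters"],
})

# Prefer title_y; where it is missing, fall back to title_x (aligned on the index).
demo["title"] = demo["title_y"].fillna(demo["title_x"])

print(demo["title"].tolist())
```

This is equivalent to the two-step assignment with `.loc` and `.isna()`, written as a single expression.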
-```python
-journals_export = journal[['id', 'title', 'name_short_iso_4', 'starting_year', 'end_year', 'url', 'country', 'language', 'oa_status', 'publisher', 'doaj_seal', 'doaj_status', 'lockss', 'portico', 'nlch', 'qoam_av_score']]
-journals_export
-```
-
-| | id | title | name_short_iso_4 | starting_year | end_year | url | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 | NaN | 215 | 138 | 1 | 1 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | NaN |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | NaN | 0.0 | 0.0 | 1.0 | 0.0 | NaN |
-| 2 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | NaN | 0.0 | 0.0 | 1.0 | 0.0 | NaN |
-| 3 | 3 | PLoS ONE | NaN | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 4.035714 |
-| 4 | 4 | EU-topías | EU-topías | 2011 | 9999 | NaN | 209 | 124, 138, 402, 292 | 1 | 4, 5 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1341 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | NaN | 0.0 | 0.0 | 1.0 | 0.0 | NaN |
-| 1342 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | NaN | 0.0 | 1.0 | 1.0 | 1.0 | NaN |
-| 1343 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | NaN | 0.0 | 1.0 | 1.0 | 1.0 | NaN |
-| 1344 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | NaN | 0.0 | 0.0 | 1.0 | 0.0 | NaN |
-| 1345 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | NaN |
-
-1346 rows × 16 columns
-
-```python
-# rename the final fields
-journals_export = journals_export.rename(columns={'title' : 'name', 'url' : 'website'})
-# fill missing values and cast the flag columns to int
-journals_export['starting_year'] = journals_export['starting_year'].fillna(0)
-journals_export['end_year'] = journals_export['end_year'].fillna(9999)
-journals_export['name_short_iso_4'] = journals_export['name_short_iso_4'].fillna('')
-journals_export['website'] = journals_export['website'].fillna('')
-journals_export['doaj_seal'] = journals_export['doaj_seal'].fillna(0)
-journals_export['country'] = journals_export['country'].fillna('999999')
-journals_export['language'] = journals_export['language'].fillna('999999')
-journals_export['doaj_status'] = journals_export['doaj_status'].astype(int)
-journals_export['doaj_seal'] = journals_export['doaj_seal'].astype(int)
-journals_export['lockss'] = journals_export['lockss'].astype(int)
-journals_export['portico'] = journals_export['portico'].astype(int)
-journals_export['nlch'] = journals_export['nlch'].astype(int)
-journals_export
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 2 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 3 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 |
-| 4 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 1 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1341 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1342 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN |
-| 1343 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN |
-| 1344 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1345 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | 0 | 0 | 0 | 0 | 0 | NaN |
-
-1346 rows × 16 columns
-
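The cell above fills missing values before casting to `int`, and the order matters: a float column that still contains `NaN` cannot be cast to a plain integer dtype. A minimal sketch of that behaviour (the `flags` Series is a made-up example, not one of the notebook's columns):

```python
import numpy as np
import pandas as pd

# Hypothetical flag column with one missing value.
flags = pd.Series([1.0, np.nan, 0.0])

# Casting while NaN is still present raises (a ValueError subclass in recent pandas).
try:
    flags.astype(int)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True

# Filling first makes the cast safe.
clean = flags.fillna(0).astype(int)

print(cast_failed, clean.tolist())
```

Columns such as `lockss` or `portico` can be cast directly only because they already hold `0.0` instead of `NaN`.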
-```python
-journals_export = journals_export.drop_duplicates(subset='id')
-journals_export
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 3 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 |
-| 4 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 1 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 5 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 1 | 6 | 0 | 0 | 0 | 1 | 0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1339 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 1 | 47 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1341 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1342 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN |
-| 1344 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1345 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | 0 | 0 | 0 | 0 | 0 | NaN |
-
-911 rows × 16 columns
-
-```python
-# check for journals without a title
-journals_export.loc[journals_export['name'].isna()]
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-```python
-# export and remove the journals without a title
-# TSV export
-journals_export.loc[journals_export['name'].isna()].to_csv('sample/sherpa_journals_without_title.tsv', sep='\t', encoding='utf-8', index=False)
-# Excel export
-journals_export.loc[journals_export['name'].isna()].to_excel('sample/sherpa_journals_without_title.xlsx', index=False)
-journals_export = journals_export.loc[journals_export['name'].notna()]
-journals_export
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 3 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 |
-| 4 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 1 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 5 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 1 | 6 | 0 | 0 | 0 | 1 | 0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1339 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 1 | 47 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1341 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1342 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN |
-| 1344 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1345 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | 0 | 0 | 0 | 0 | 0 | NaN |
-
-911 rows × 16 columns
-
-```python
-# list the titles that mention "Print" (plain substring match, no regex groups)
-journals_export.loc[journals_export['name'].str.contains('Print', regex=False)]
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 86 | 54 | Helvetica physica acta (Print) | Helv. phys. acta (Print) | 1928 | 1999 |  | 215 | 124, 138, 151 | 1 | 41 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 239 | 155 | Studies in health technology and informatics (... | Stud. health technol. inform. (Print) | 1991 | 9999 |  | 156 | 124 | 1 | 90 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 441 | 306 | Bioethica Forum (Basel. 2008. Print) | Bioeth. Forum (Basel, 2008, Print) | 2008 | 9999 |  | 215 | 138, 124, 151 | 1 | 143 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 534 | 373 | Schweizerische Ärztezeitung (Print) | Schweiz. Ärzteztg. (Print) | 1952 | 9999 |  | 215 | 203, 151, 138 | 1 | 170 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 601 | 430 | The European physical journal. B, Condensed ma... | Eur. phys. j., B Cond. matter phys. (Print) | 1998 | 9999 |  | 76 | 124 | 1 | 195, 43 | 0 | 0 | 1 | 1 | 1 | 1.25 |
-| 650 | 467 | Conference on Lasers and Electro-optics (Print) | Conf. Lasers Electro-opt. (Print) | 2003 | 9999 | http://www.cleoconference.org/ | 236 | 124 | 1 | 39 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 850 | 618 | Swiss archives of neurology, psychiatry and ps... | Swiss arch. neurol. psychiatry psychother. (Pr... | 2016 | 9999 |  | 215 | 151, 124, 138 | 6 | 20 | 0 | 1 | 0 | 0 | 0 | NaN |
-| 901 | 660 | Journal der Deutschen Dermatologischen Gesells... |  | 2003 | 9999 |  | 234 | 151, 124 | 1 | 283 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 957 | 702 | IEEE/LEOS International Conference on Optical ... | IEEE/LEOS Int. Conf. Opt. MEMS Nanophotonics (... | 2007 | 20uu | http://ieeexplore.ieee.org/xpl/conhome.jsp?pun... | 236 | 124 | 1 | 280 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1104 | 814 | Forumpoenale (Print) | Forumpoenale (Print) | 2008 | 9999 |  | 215 | 151, 203, 138 | 1 | 204 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1182 | 877 | Gesnerus (Print) | Gesnerus (Print) | 1943 | 9999 |  | 215 | 124, 138, 151, 203 | 1 | 143 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1336 | 994 | GG@G (Print) | GG@G (Print) | 2000 | 9999 |  | 215 | 124 | 1 | 380 | 0 | 0 | 0 | 0 | 0 | NaN |
-
-```python
-journals_export.loc[journals_export['name'].str.contains('(Online)', regex=False)]
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 1257 | 936 | Plastic and reconstructive surgery (Online) | Plast. reconstr. surg. (Online) | 1963 | 9999 | http://gateway.ovid.com/ovidweb.cgi?T=JS&MODE=... | 236 | 124 | 1 | 363 | 0 | 0 | 0 | 0 | 0 | NaN |
-
-```python
-# strip the " (Print)" and " (Online)" suffixes from the titles;
-# regex=False matches the parentheses literally instead of as a regex group
-journals_export['name'] = journals_export['name'].str.replace(' (Print)', '', regex=False)
-journals_export['name'] = journals_export['name'].str.replace(' (Online)', '', regex=False)
-journals_export
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 3 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 |
-| 4 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 1 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN |
-| 5 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 1 | 6 | 0 | 0 | 0 | 1 | 0 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1339 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 1 | 47 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1341 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1342 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN |
-| 1344 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | 0 | 0 | 0 | 1 | 0 | NaN |
-| 1345 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | 0 | 0 | 0 | 0 | 0 | NaN |
-
-911 rows × 16 columns
-
-```python
-journals_export.loc[journals_export['name'].str.contains('(Print)', regex=False)]
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-```python
-journals_export.loc[journals_export['name'].str.contains('(Online)', regex=False)]
-```
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
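The title clean-up above depends on how the parentheses are interpreted: as a regular expression, `(Print)` is a capture group that matches only the word `Print`, so a regex-based replace would leave the brackets behind. A minimal sketch of literal matching with `regex=False` (the `names` Series is a made-up sample, not the real export):

```python
import pandas as pd

# Hypothetical sample of journal titles.
names = pd.Series([
    "Gesnerus (Print)",
    "Plastic and reconstructive surgery (Online)",
    "PLoS ONE",
])

# Literal substring test: the parentheses are ordinary characters here.
has_print = names.str.contains("(Print)", regex=False)

# Literal replacement, including the leading space, so no "()" or trailing blank remains.
cleaned = (
    names.str.replace(" (Print)", "", regex=False)
         .str.replace(" (Online)", "", regex=False)
)

print(has_print.tolist())
print(cleaned.tolist())
```

Passing `regex=False` explicitly also keeps the behaviour stable across pandas versions, since the default of `str.replace` has changed over time.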
-## Table sherpa_policies
-
-
-```python
-# create the DataFrame
-col_names = ['journal',
- 'issn',
- 'sherpa_id',
- 'sherpa_uri',
- 'open_access_prohibited',
- 'additional_oa_fee',
- 'article_version',
- 'license',
- 'embargo',
- 'prerequisites',
- 'prerequisite_funders',
- 'prerequisite_funders_name',
- 'prerequisite_funders_fundref',
- 'prerequisite_funders_ror',
- 'prerequisite_funders_country',
- 'prerequisite_funders_url',
- 'prerequisite_funders_sherpa_id',
- 'prerequisite_subjects',
- 'location',
- 'locations_ir',
- 'locations_not_ir',
- 'named_repository',
- 'named_academic_social_network',
- 'copyright_owner',
- 'publisher_deposit',
- 'archiving',
- 'conditions',
- 'public_notes'
- ]
-sherpa_policies = pd.DataFrame(columns = col_names)
-sherpa_policies
-```
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-```python
-# deduplicate by journal id
-issn_dedup = issn.drop_duplicates(subset='journal')
-issn_dedup
-```
-
-| | issn | issnl | journal | format | issn_type | id |
-|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 |
-| 7 | 0001-6268 | 0001-6268 | 166 | PRINT | 1 | 8 |
-| 9 | 0001-6322 | 0001-6322 | 807 | PRINT | 1 | 10 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 1751 | 2469-9950 | 2469-9950 | 41 | PRINT | 1 | 1752 |
-| 1753 | 2470-0010 | 2470-0010 | 80 | PRINT | 1 | 1754 |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 |
-
-909 rows × 6 columns
-
-```python
-# repository types that set archiving = 1:
-# all types: 'academic_social_network', 'any_repository', 'any_website', 'authors_homepage',
-# 'funder_designated_location', 'institutional_repository', 'institutional_website', 'named_academic_social_network',
-# 'named_repository', 'non_commercial_institutional_repository', 'non_commercial_repository',
-# 'non_commercial_social_network', 'non_commercial_subject_repository', 'non_commercial_website',
-# 'preprint_repository', 'subject_repository', 'this_journal'
-repositories_archiving = ['any_repository',
- 'institutional_repository',
- 'institutional_website',
- 'non_commercial_institutional_repository',
- 'non_commercial_repository',
- 'any_website',
- 'non_commercial_website']
-
-# extract the terms
-for index, row in issn_dedup.iterrows():
- journal_id = row['journal']
- journal_issn = row['issn']
- # loop over the JSON files
- # print(row['format'])
- if index % 10 == 0:
- print(index)
- # check that the file exists
- if os.path.exists('sherpa/data/' + journal_issn + '.json'):
- with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
- data = json.load(f)
- # initialise the variables to extract
- sherpa_id = np.nan
- sherpa_uri = np.nan
- open_access_prohibited = np.nan
- location = np.nan
- locations_ir = ''
- locations_not_ir = ''
- additional_oa_fee = np.nan
- article_versions = np.nan
- article_version = np.nan
- licenses = []
- embargo = 0
- prerequisites = np.nan
- prerequisite_funders = np.nan
- prerequisite_funders_name = np.nan
- prerequisite_funders_fundref = np.nan
- prerequisite_funders_ror = np.nan
- prerequisite_funders_country = np.nan
- prerequisite_funders_url = np.nan
- prerequisite_funders_sherpa_id = np.nan
- prerequisite_subjects = np.nan
- named_repository = np.nan
- named_academic_social_network = np.nan
- copyright_owner = np.nan
- publisher_deposit = np.nan
- archiving = np.nan
- conditions = np.nan
- public_notes = np.nan
- if (len(data['items']) > 0):
- if ('id' in data['items'][0]):
- sherpa_id = data['items'][0]['id']
- # skip if this Sherpa id is already present (test the column values; `in` on a Series checks the index)
- if sherpa_id in sherpa_policies['sherpa_id'].values :
- print('SKIP ' + str(sherpa_id))
- else :
- poilicies = data['items'][0]['publisher_policy']
- for poilicy in poilicies:
- # initialise the variables to extract
- sherpa_uri = np.nan
- open_access_prohibited = np.nan
- if ('uri' in poilicy):
- sherpa_uri = poilicy['uri']
- if ('open_access_prohibited' in poilicy):
- open_access_prohibited = poilicy['open_access_prohibited']
- if ('permitted_oa' in poilicy):
- poas = poilicy['permitted_oa']
- for poa in poas:
- additional_oa_fee = np.nan
- article_versions = np.nan
- article_version = np.nan
- licenses = []
- embargo = 0
- prerequisites = np.nan
- prerequisite_funders = np.nan
- prerequisite_funders_name = np.nan
- prerequisite_funders_fundref = np.nan
- prerequisite_funders_ror = np.nan
- prerequisite_funders_country = np.nan
- prerequisite_funders_url = np.nan
- prerequisite_funders_sherpa_id = np.nan
- prerequisite_subjects = np.nan
- named_repository = np.nan
- named_academic_social_network = np.nan
- locations_ir = ''
- locations_not_ir = ''
- copyright_owner = np.nan
- conditions = np.nan
- public_notes = np.nan
- if ('additional_oa_fee' in poa):
- additional_oa_fee = poa['additional_oa_fee']
- if ('location' in poa):
- archiving = 0
- location = ''
- mylocations = poa['location']['location']
- mylocations_text = poa['location']['location_phrases']
- if (type(mylocations) is not list):
- mylocations = [mylocations]
- location = ' ; '.join(mylocations)
- for locationi in mylocations:
- if locationi in repositories_archiving :
- archiving = archiving + 1
- for locationi_text in mylocations_text:
- if locationi_text['value'] == locationi :
- if locations_ir == '':
- locations_ir = locations_ir + locationi_text['phrase']
- else :
- if locationi_text['phrase'] not in locations_ir :
- locations_ir = locations_ir + ' ; ' + locationi_text['phrase']
- else :
- for locationi_text in mylocations_text:
- if locationi_text['value'] == locationi :
- if locations_not_ir == '':
- locations_not_ir = locations_not_ir + locationi_text['phrase']
- else :
- if locationi_text['phrase'] not in locations_not_ir :
- locations_not_ir = locations_not_ir + ' ; ' + locationi_text['phrase']
- # print (archiving)
- if archiving > 0:
- archiving = True
- else :
- archiving = False
- if ('named_repository' in poa['location']):
- if (type(poa['location']['named_repository']) is list):
- named_repository = ' ; '.join(poa['location']['named_repository'])
- else :
- named_repository = poa['location']['named_repository']
- locations_not_ir = locations_not_ir.replace('Named Repository', named_repository)
- locations_ir = locations_ir.replace('Named Repository', named_repository)
- if ('named_academic_social_network' in poa['location']):
- if (type(poa['location']['named_academic_social_network']) is list):
- named_academic_social_network = ' ; '.join(poa['location']['named_academic_social_network'])
- else :
- named_academic_social_network = poa['location']['named_academic_social_network']
- locations_not_ir = locations_not_ir.replace('Named Academic Social Network', named_academic_social_network)
- locations_ir = locations_ir.replace('Named Academic Social Network', named_academic_social_network)
- if ('embargo' in poa):
- # print(poa['embargo'])
- embargo_amount = 0
- if ('amount' in poa['embargo']):
- embargo_amount = poa['embargo']['amount']
- if ('units' in poa['embargo']):
- if (poa['embargo']['units'] == 'months') :
- embargo = embargo_amount
- elif (poa['embargo']['units'] == 'years') :
- embargo = embargo_amount*12
- elif (poa['embargo']['units'] == 'weeks') :
- embargo = int(embargo_amount/4)
- if (embargo == 0):
- embargo = 1
- elif (poa['embargo']['units'] == 'days') :
- embargo = int(embargo_amount/30)
- if (embargo == 0):
- embargo = 1
- else :
- embargo = embargo_amount
- if ('prerequisites' in poa):
- if 'prerequisites' in poa['prerequisites'] :
- if (type(poa['prerequisites']['prerequisites']) is list):
- prerequisites = ' ; '.join(poa['prerequisites']['prerequisites'])
- else:
- prerequisites = poa['prerequisites']['prerequisites']
- if ('prerequisite_funders' in poa['prerequisites']):
- prerequisite_funders = True
- # prerequisite_funders = poa['prerequisites']['prerequisite_funders']
- # if (type(poa['prerequisites']['prerequisite_funders']) is list):
- # prerequisite_funders = ' ; '.join(poa['prerequisites']['prerequisite_funders'])
- # else:
- # prerequisite_funders = poa['prerequisites']['prerequisite_funders']
- if ('prerequisite_subjects' in poa['prerequisites']):
- prerequisite_subjects = True
- # prerequisite_subjects = poa['prerequisites']['prerequisite_subjects']
- # if (type(poa['prerequisite_subjects']) is list):
- # prerequisite_subjects = ' ; '.join(poa['prerequisite_subjects'])
- # else:
- # prerequisite_subjects = poa['prerequisite_subjects']
- if ('copyright_owner' in poa):
- copyright_owner = poa['copyright_owner']
- if ('publisher_deposit' in poa):
- publisher_deposit = ''
- if (type(poa['publisher_deposit']) is list):
- for deposit in poa['publisher_deposit']:
- if 'type' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['type']
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + ' (' + deposit['repository_metadata']['name'][0]['name'] + ')'
- else :
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['name'][0]['name']
- publisher_deposit = publisher_deposit + ' ; '
- else :
- deposit = poa['publisher_deposit']
- if 'type' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['type']
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + ' (' + deposit['repository_metadata']['name'][0]['name'] + ')'
- else :
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['name'][0]['name']
- publisher_deposit = publisher_deposit + ' ; '
- # print (publisher_deposit)
- if ('conditions' in poa):
- if (type(poa['conditions']) is list):
- conditions = ' ; '.join(poa['conditions'])
- else:
- conditions = poa['conditions']
- if ('public_notes' in poa):
- if (type(poa['public_notes']) is list):
- public_notes = ' ; '.join(poa['public_notes'])
- else:
- public_notes = poa['public_notes']
- if ('license' in poa):
- licenses = poa['license']
- if (type(licenses) is not list):
- licenses = [licenses]
- else :
- licenses = ['']
- # with article version
- if ('article_version' in poa):
- article_versions = poa['article_version']
- for article_version in article_versions:
- for license in licenses:
- if ('license' in license):
- mylicense = license['license']
- else :
- mylicense = ''
- # with prerequisites
- if ('prerequisites' in poa) :
- # with prerequisite_funders
- if ('prerequisite_funders' in poa['prerequisites']):
- for prerequisite_fundersi in poa['prerequisites']['prerequisite_funders'] :
- prerequisite_funders_name = prerequisite_fundersi['funder_metadata']['name'][0]['name']
- if 'acronym' in prerequisite_fundersi['funder_metadata']['name'][0]:
- prerequisite_funders_name = prerequisite_funders_name + ' (' + prerequisite_fundersi['funder_metadata']['name'][0]['acronym'] + ')'
- if 'identifiers' in prerequisite_fundersi['funder_metadata'] :
- for fund_identifier in prerequisite_fundersi['funder_metadata']['identifiers'] :
- if fund_identifier['type'] == 'fundref':
- prerequisite_funders_fundref = fund_identifier['identifier']
- if fund_identifier['type'] == 'ror':
- prerequisite_funders_ror = fund_identifier['identifier']
- if 'country' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_country = prerequisite_fundersi['funder_metadata']['country']
- if 'url' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_url = prerequisite_fundersi['funder_metadata']['url'][0]['url']
- prerequisite_funders_sherpa_id = prerequisite_fundersi['funder_metadata']['id']
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisite_funders
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisites
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
-
- # without article version
- else :
- if (type(licenses) is not list):
- licenses = [licenses]
- for license in licenses:
- if ('license' in license):
- mylicense = license['license']
- else :
- mylicense = ''
- # with prerequisites
- if ('prerequisites' in poa) :
- # with prerequisite_funders
- if ('prerequisite_funders' in poa['prerequisites']):
- for prerequisite_fundersi in poa['prerequisites']['prerequisite_funders'] :
- prerequisite_funders_name = prerequisite_fundersi['funder_metadata']['name'][0]['name']
- if 'acronym' in prerequisite_fundersi['funder_metadata']['name'][0]:
- prerequisite_funders_name = prerequisite_funders_name + ' (' + prerequisite_fundersi['funder_metadata']['name'][0]['acronym'] + ')'
- if 'identifiers' in prerequisite_fundersi['funder_metadata'] :
- for fund_identifier in prerequisite_fundersi['funder_metadata']['identifiers'] :
- if fund_identifier['type'] == 'fundref':
- prerequisite_funders_fundref = fund_identifier['identifier']
- if fund_identifier['type'] == 'ror':
- prerequisite_funders_ror = fund_identifier['identifier']
- if 'country' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_country = prerequisite_fundersi['funder_metadata']['country']
- if 'url' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_url = prerequisite_fundersi['funder_metadata']['url'][0]['url']
- prerequisite_funders_sherpa_id = prerequisite_fundersi['funder_metadata']['id']
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisite_funders
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisites
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without permitted_oa
- else :
- print ('permitted_oa MISSING')
- else :
- print ('id MISSING')
-```
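The embargo handling in the cell above converts every unit to whole months and rounds short embargoes up to one month. A minimal sketch of the same rule as a standalone function follows; the name `embargo_in_months` is hypothetical, and returning the raw amount when `units` is missing is an assumption (the original cell leaves `embargo` untouched in that case).

```python
def embargo_in_months(embargo):
    """Normalize a Sherpa-style embargo dict to whole months (illustrative helper)."""
    amount = embargo.get('amount', 0)
    units = embargo.get('units')
    if units == 'years':
        return amount * 12
    if units == 'weeks':
        # integer division, but never report less than one month
        return max(1, amount // 4)
    if units == 'days':
        return max(1, amount // 30)
    # 'months', unknown units, or (assumption) no units at all: keep the amount
    return amount


print(embargo_in_months({'amount': 2, 'units': 'years'}))
print(embargo_in_months({'amount': 2, 'units': 'weeks'}))
```

Keeping the conversion in one function makes the rounding rule for weeks and days easier to test than when it is repeated inline.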
-
- 0
- 20
- 40
- 50
- 60
- SKIP 321
- 110
- SKIP 475
- SKIP 476
- 180
- 220
- 250
- 260
- 290
- 300
- 330
- 340
- 360
- 370
- 380
- 420
- permitted_oa MISSING
- 430
- permitted_oa MISSING
- SKIP 1319
- SKIP 880
- permitted_oa MISSING
- 510
- permitted_oa MISSING
- 530
- 540
- 550
- 560
- SKIP 1342
- 570
- 590
- SKIP 3082
- SKIP 2465
- SKIP 1682
- SKIP 325
- SKIP 3179
- 670
- 680
- SKIP 1641
- SKIP 1202
- 720
- SKIP 3995
- 730
- SKIP 3475
- SKIP 3490
- 740
- 750
- 760
- SKIP 1383
- SKIP 1357
- permitted_oa MISSING
- 830
- 840
- SKIP 1868
- 850
- SKIP 883
- 880
- 890
- SKIP 1392
- 900
- 910
- SKIP 1377
- 920
- SKIP 3443
- 930
- 940
- SKIP 1123
- SKIP 3581
- SKIP 3558
- SKIP 745
- 980
- 990
- SKIP 11
- SKIP 2499
- 1000
- SKIP 42
- 1010
- 1020
- SKIP 314
- 1030
- 1040
- SKIP 1380
- SKIP 229
- SKIP 1518
- SKIP 5682
- SKIP 4708
- SKIP 1661
- 1130
- SKIP 6585
- 1140
- SKIP 3212
- 1150
- SKIP 335
- SKIP 6774
- 1160
- SKIP 6590
- 1180
- SKIP 1639
- SKIP 5094
- SKIP 1254
- 1200
- SKIP 6325
- SKIP 3539
- SKIP 1444
- SKIP 250
- SKIP 1543
- SKIP 3415
- SKIP 3571
- SKIP 3474
- SKIP 3586
- SKIP 3220
- SKIP 3837
- SKIP 1650
- SKIP 1051
- SKIP 3572
- SKIP 612
- SKIP 6587
- SKIP 3567
- SKIP 1654
- SKIP 4070
- SKIP 1643
- SKIP 6588
- SKIP 1657
- SKIP 1687
- SKIP 1692
- SKIP 1341
- 1320
- SKIP 7150
- SKIP 876
- 1330
- SKIP 7007
- SKIP 7091
- 1340
- 1350
- SKIP 173
- SKIP 4703
- 1360
- SKIP 2515
- 1370
- SKIP 242
- SKIP 3930
- SKIP 2004
- 1400
- 1410
- SKIP 2123
- SKIP 1320
- SKIP 1459
- SKIP 1588
- SKIP 7678
- SKIP 1391
- SKIP 878
- SKIP 138
- SKIP 7632
- SKIP 1644
- SKIP 1637
- SKIP 2207
- SKIP 2428
- SKIP 2432
- 1460
- SKIP 2477
- SKIP 2430
- SKIP 1653
- SKIP 2397
- SKIP 5935
- SKIP 3527
- SKIP 148
- SKIP 7793
- SKIP 4005
- SKIP 7768
- SKIP 3455
- SKIP 1652
- SKIP 3570
- SKIP 7792
- SKIP 3533
- SKIP 6586
- 1520
- SKIP 7787
- SKIP 3355
- 1530
- SKIP 226
- SKIP 1655
- SKIP 7783
- 1540
- SKIP 6582
- 1550
- SKIP 7762
- SKIP 4691
- SKIP 1911
- SKIP 1447
- SKIP 1778
- SKIP 1888
- SKIP 228
- SKIP 7407
- SKIP 7965
- 1590
- 1600
- 1610
- SKIP 821
- SKIP 823
- SKIP 7714
- 1620
- SKIP 172
- SKIP 2624
- SKIP 3654
- SKIP 1659
- SKIP 1656
- SKIP 1658
- SKIP 1393
- 1640
- SKIP 6778
- SKIP 8220
- SKIP 7872
- SKIP 1587
- SKIP 822
- SKIP 1460
- SKIP 6581
- SKIP 3568
- 1670
- SKIP 7509
- SKIP 7799
- SKIP 7765
- 1680
- SKIP 7761
- SKIP 7800
- 1690
- SKIP 1244
- 1710
- SKIP 6222
- 1730
- 1740
- 1750
-
-
-
-```python
-sherpa_policies
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Must acknowledge acceptance for publication ; ... | NaN |
-| 1 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | accepted |  | 12 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Publisher source must be acknowledged with cit... | NaN |
-| 2 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; institutional_repository ; named... | Any Website ; Institutional Repository | PubMed Central ; Subject Repository ; Journal ... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN |
-| 3 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by_nc_nd | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; named_repository ; non_commercia... | Any Website ; Non-Commercial Institutional Rep... | PubMed Central ; Non-Commercial Subject Reposi... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN |
-| 4 | 498 | 0001-4842 | 7760 | https://v2.sherpa.ac.uk/id/publisher_policy/4 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | named_repository ; preprint_repository ; subje... |  | ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito... | ChemRxiv ; bioRxiv ; arXiv | NaN | NaN | NaN | False | Must not violate ACS ethical Guidelines ; Must... | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8590 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN |
-| 8591 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | accepted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN |
-| 8592 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | published |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN |
-| 8593 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN |
-| 8594 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN |
-
-
-
-
8595 rows × 28 columns
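The output above holds one row per combination of article version and licence, which the loading cell builds with repeated `sherpa_policies.append(...)` calls. `DataFrame.append` was removed in pandas 2.0, so a rerun on a current pandas needs another pattern. One option, sketched below under stated assumptions, is to collect plain dicts first and build the frame once with `pd.DataFrame(rows)`. The function `expand_policy` and its `common` argument are hypothetical names, and using `None` when `article_version` is missing is an assumption.

```python
from itertools import product


def expand_policy(poa, common):
    """Return one flat row per (article_version, license) pair of a policy.

    `poa` is assumed to look like one 'permitted_oa' entry; `common` holds the
    fields shared by every row (journal id, ISSN, ...). Illustrative only.
    """
    licenses = poa.get('license', [{}])
    if not isinstance(licenses, list):
        licenses = [licenses]
    versions = poa.get('article_version', [None])
    rows = []
    # versions vary slowest, licenses fastest, as in the nested loops of the notebook
    for version, lic in product(versions, licenses):
        row = dict(common)
        row['article_version'] = version
        row['license'] = lic.get('license', '')
        rows.append(row)
    return rows


rows = expand_policy(
    {'article_version': ['accepted', 'published'],
     'license': [{'license': 'cc_by'}, {'license': 'cc_by_nc_nd'}]},
    {'journal': 532, 'issn': '0001-2815'},
)
print(len(rows))
```

Building the list first and converting it once also avoids copying the whole frame on every appended row.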
-
-
-
-
-
-```python
-# convert the index into an id column
-sherpa_policies = sherpa_policies.reset_index()
-# add the id as index + 1
-sherpa_policies['id'] = sherpa_policies['index'] + 1
-del sherpa_policies['index']
-sherpa_policies
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes | id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Must acknowledge acceptance for publication ; ... | NaN | 1 |
-| 1 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | accepted |  | 12 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Publisher source must be acknowledged with cit... | NaN | 2 |
-| 2 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; institutional_repository ; named... | Any Website ; Institutional Repository | PubMed Central ; Subject Repository ; Journal ... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 3 |
-| 3 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by_nc_nd | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; named_repository ; non_commercia... | Any Website ; Non-Commercial Institutional Rep... | PubMed Central ; Non-Commercial Subject Reposi... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 4 |
-| 4 | 498 | 0001-4842 | 7760 | https://v2.sherpa.ac.uk/id/publisher_policy/4 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | named_repository ; preprint_repository ; subje... |  | ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito... | ChemRxiv ; bioRxiv ; arXiv | NaN | NaN | NaN | False | Must not violate ACS ethical Guidelines ; Must... | NaN | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8590 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8591 |
-| 8591 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | accepted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8592 |
-| 8592 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | published |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8593 |
-| 8593 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8594 |
-| 8594 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8595 |
-
-
-
-
8595 rows × 29 columns
-
-
-
-
-
-```python
-# export as TSV (tab-separated values)
-sherpa_policies.to_csv('sample/sherpa_policies_brut.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-sherpa_policies.to_excel('sample/sherpa_policies_brut.xlsx', index=False)
-```
-
-## Computing the "green" category and final export of the journals
-
-
-```python
-sherpa_policies
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes | id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Must acknowledge acceptance for publication ; ... | NaN | 1 |
-| 1 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | accepted |  | 12 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Publisher source must be acknowledged with cit... | NaN | 2 |
-| 2 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; institutional_repository ; named... | Any Website ; Institutional Repository | PubMed Central ; Subject Repository ; Journal ... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 3 |
-| 3 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by_nc_nd | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; named_repository ; non_commercia... | Any Website ; Non-Commercial Institutional Rep... | PubMed Central ; Non-Commercial Subject Reposi... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 4 |
-| 4 | 498 | 0001-4842 | 7760 | https://v2.sherpa.ac.uk/id/publisher_policy/4 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | named_repository ; preprint_repository ; subje... |  | ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito... | ChemRxiv ; bioRxiv ; arXiv | NaN | NaN | NaN | False | Must not violate ACS ethical Guidelines ; Must... | NaN | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8590 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | submitted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8591 |
-| 8591 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | accepted |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8592 |
-| 8592 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | published |  | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8593 |
-| 8593 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8594 |
-| 8594 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8595 |
-
-
-
-
8595 rows × 29 columns
-
-
-
-
-
-```python
-sherpa_policies_ir = sherpa_policies.loc[(sherpa_policies['archiving'] == True) & (sherpa_policies['article_version'] == 'published') & (sherpa_policies['prerequisite_funders'].isna())][['journal', 'embargo', 'license', 'conditions']]
-sherpa_policies_ir
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | embargo | license | conditions |
-|---|---|---|---|---|
-| 2 | 532 | 0 | cc_by | Published source must be acknowledged |
-| 3 | 532 | 0 | cc_by_nc_nd | Published source must be acknowledged |
-| 9 | 498 | 12 | cc_by | NaN |
-| 10 | 498 | 12 | cc_by_nc_nd | NaN |
-| 11 | 498 | 12 | bespoke_license | NaN |
-| ... | ... | ... | ... | ... |
-| 8588 | 533 | 0 | cc_by | NaN |
-| 8589 | 533 | 0 | cc_by | NaN |
-| 8592 | 608 | 0 |  | Must link to published article ; Publisher cop... |
-| 8593 | 608 | 0 | cc_by | NaN |
-| 8594 | 608 | 0 | cc_by | NaN |
-
-
-
-
1118 rows × 4 columns
-
-
-
-
-
-```python
-# deduplicate: the sort puts the shortest embargo first, so that row is kept for each journal
-sherpa_policies_ir_id = sherpa_policies_ir[['journal', 'embargo']].sort_values(by=['journal', 'embargo'])
-sherpa_policies_ir_dedup = sherpa_policies_ir_id.drop_duplicates(subset='journal')
-sherpa_policies_ir_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | embargo |
-|---|---|---|
-| 2367 | 2 | 0 |
-| 8342 | 3 | 0 |
-| 7366 | 5 | 0 |
-| 261 | 6 | 12 |
-| 7086 | 7 | 0 |
-| ... | ... | ... |
-| 6479 | 996 | 0 |
-| 6873 | 997 | 0 |
-| 1823 | 998 | 0 |
-| 3944 | 999 | 0 |
-| 6750 | 1000 | 0 |
-
-
-
-
579 rows × 2 columns
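The deduplication above relies on the sort order: after sorting by `journal` and `embargo`, `drop_duplicates(subset='journal')` keeps the first row of each journal, that is, the one with the shortest embargo. A small sketch on made-up data (the values below are illustrative, not taken from the Sherpa export) compares this with an explicit `groupby(...).min()`:

```python
import pandas as pd

# toy data: two journals, each with two candidate embargo values
policies = pd.DataFrame({'journal': [498, 532, 498, 532],
                         'embargo': [12, 6, 0, 24]})

# approach used in the notebook: sort, then keep the first row per journal
first_after_sort = (policies.sort_values(by=['journal', 'embargo'])
                    .drop_duplicates(subset='journal')
                    .reset_index(drop=True))

# explicit alternative: minimum embargo per journal
minimum = policies.groupby('journal', as_index=False)['embargo'].min()

print(first_after_sort)
print(minimum)
```

Both forms select the same embargo per journal here; the `groupby` version states the intent directly, while the sort-based version also carries along any other columns of the kept row.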
-
-
-
-
-
-```python
-# add the green category (2); assign() returns a new DataFrame, which avoids SettingWithCopyWarning
-sherpa_policies_ir_dedup = sherpa_policies_ir_dedup.assign(oa_status=2)
-sherpa_policies_ir_dedup
-```
-
-
-
-
-
-
-
-
-
-
-
-
-
-| | journal | embargo | oa_status |
-|---|---|---|---|
-| 2367 | 2 | 0 | 2 |
-| 8342 | 3 | 0 | 2 |
-| 7366 | 5 | 0 | 2 |
-| 261 | 6 | 12 | 2 |
-| 7086 | 7 | 0 | 2 |
-| ... | ... | ... | ... |
-| 6479 | 996 | 0 | 2 |
-| 6873 | 997 | 0 | 2 |
-| 1823 | 998 | 0 | 2 |
-| 3944 | 999 | 0 | 2 |
-| 6750 | 1000 | 0 | 2 |
-
-
-
-
579 rows × 3 columns
-
-
-
-
-
-```python
-# merge with the journals
-sherpa_policies_ir_dedup = sherpa_policies_ir_dedup.rename(columns={'journal' : 'id'})
-journals_export = pd.merge(journals_export, sherpa_policies_ir_dedup, on='id', how='left')
-journals_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status_x | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | embargo | oa_status_y |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 |
-| 2 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 0 | 2.0 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 1 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 1 | 6 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 1 | 47 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN | 0 | 2.0 |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN |
-
-
-
-
911 rows × 18 columns
-
-
-
-
-
-```python
-# choose the OA category
-journals_export['oa_status'] = journals_export['oa_status_x']
-journals_export.loc[(journals_export['oa_status_x'] == 1) & (journals_export['oa_status_y'].notna()), 'oa_status'] = journals_export['oa_status_y']
-journals_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | oa_status_x | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | embargo | oa_status_y | oa_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | 1.0 |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 | 2.0 |
-| 2 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 5 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 0 | 2.0 | 5.0 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 1 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | 1.0 |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 1 | 6 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 | 2.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 1 | 47 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 | 2.0 |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 1 | 75 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 | 2.0 |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 1 | 8 | 0 | 0 | 1 | 1 | 1 | NaN | 0 | 2.0 | 2.0 |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 1 | 119 | 0 | 0 | 0 | 1 | 0 | NaN | 0 | 2.0 | 2.0 |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 1 | 217 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | 1.0 |
-
-911 rows × 19 columns
-
-
-
-
-
-```python
-# 6 : Diamond
-# 5 : Gold
-# 4 : Full
-# 3 : Hybrid
-# 2 : Green
-# 1 : UNKNOWN
-journals_export['oa_status'].value_counts()
-```
-
-
-
-
- 2.0 518
- 1.0 306
- 5.0 70
- 6.0 17
- Name: oa_status, dtype: int64
-
-
-
-
-```python
-del journals_export['embargo']
-del journals_export['oa_status_x']
-del journals_export['oa_status_y']
-journals_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | oa_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 1.0 |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 2 | 0 | 0 | 0 | 1 | 0 | NaN | 2.0 |
-| 2 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 5.0 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN | 1.0 |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 6 | 0 | 0 | 0 | 1 | 0 | NaN | 2.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 47 | 0 | 0 | 0 | 1 | 0 | NaN | 2.0 |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 75 | 0 | 0 | 0 | 1 | 0 | NaN | 2.0 |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 8 | 0 | 0 | 1 | 1 | 1 | NaN | 2.0 |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 119 | 0 | 0 | 0 | 1 | 0 | NaN | 2.0 |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 217 | 0 | 0 | 0 | 0 | 0 | NaN | 1.0 |
-
-911 rows × 16 columns
-
-
-
-
-
-```python
-journals_export['oa_status'] = journals_export['oa_status'].astype(int)
-journals_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | oa_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 |  | 215 | 138 | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 1 |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 2 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 2 | 3 | PLoS ONE |  | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 5 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 |  | 209 | 124, 138, 402, 292 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN | 1 |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 6 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 47 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 75 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 8 | 0 | 0 | 1 | 1 | 1 | NaN | 2 |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 119 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 217 | 0 | 0 | 0 | 0 | 0 | NaN | 1 |
-
-911 rows × 16 columns
-
-
-
-
-
-```python
-# export csv
-journals_export.to_csv('sample/journal_fin_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-journals_export.to_excel('sample/journal_fin_sherpa.xlsx', index=False)
-```
-
-
-```python
-# export csv
-sherpa_policies_ir_dedup.to_csv('sample/journal_ir.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-sherpa_policies_ir_dedup.to_excel('sample/journal_ir.xlsx', index=False)
-```
-
-
-```python
-
-```
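The deleted notebook cells above keep one row per journal (the shortest embargo), tag those journals as green (2), and then promote only journals whose status was unknown (1). A minimal sketch of the same steps in one function follows, assuming the column names shown in the notebook output. The function name `merge_green_status`, the helper column `green_status`, and the sample frames are hypothetical. Working on an explicit copy is one way to avoid the `SettingWithCopyWarning` printed above.

```python
import pandas as pd


def merge_green_status(journals: pd.DataFrame, policies: pd.DataFrame) -> pd.DataFrame:
    """Promote journals with unknown status (1) to green (2) when a policy row exists."""
    green = (
        policies[["journal", "embargo"]]
        # Sorting first means drop_duplicates keeps the shortest embargo per journal.
        .sort_values(by=["journal", "embargo"])
        .drop_duplicates(subset="journal")
        .rename(columns={"journal": "id"})
        # Explicit copy, so the column assignment below never targets a slice.
        .copy()
    )
    green["green_status"] = 2

    merged = journals.merge(green, on="id", how="left")
    # Only unknown journals (1) are promoted; gold, diamond, etc. keep their status.
    promote = (merged["oa_status"] == 1) & merged["green_status"].notna()
    merged.loc[promote, "oa_status"] = 2
    return merged.drop(columns=["green_status"])


# Hypothetical sample data, not taken from the OACCT dataset.
journals = pd.DataFrame({"id": [1, 2, 3], "oa_status": [1, 5, 1]})
policies = pd.DataFrame({"journal": [1, 1, 2], "embargo": [12, 0, 6]})
result = merge_green_status(journals, policies)
print(result)
```

Keeping the helper column separate from `oa_status` avoids the `oa_status_x` / `oa_status_y` suffixes that the merge in the notebook produces and later deletes.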
diff --git a/import_scripts/06_oacct_sherpa.py b/import_scripts/06_oacct_sherpa.py
deleted file mode 100644
index 158819fd..00000000
--- a/import_scripts/06_oacct_sherpa.py
+++ /dev/null
@@ -1,1107 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts the Sherpa/Romeo data obtained through the API and processes it so that it can be used in the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# ## Sherpa/Romeo data
-#
-# ### Example
-#
-# https://v2.sherpa.ac.uk/cgi/retrieve_by_id?item-type=publication&api-key=EEE6F146-678E-11EB-9C3A-202F3DE2659A&format=Json&identifier=17601
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# show all columns
-pd.set_option('display.max_columns', None)
-
-
-# ## Table publisher_sherpa
-
-# In[2]:
-
-
-# create the DataFrame
-col_names = ['journal',
- 'publisher_id',
- 'name',
- 'country',
- 'type',
- 'url'
- ]
-publisher_sherpa = pd.DataFrame(columns = col_names)
-publisher_sherpa
-
-
-# ## Table sherpa match issn
-
-# In[3]:
-
-
-# create the DataFrame
-col_names = ['issn',
- 'sherpa_match',
- ]
-sherpa_match_issn = pd.DataFrame(columns = col_names)
-sherpa_match_issn
-
-
-# ## Table sherpa issns
-
-# In[4]:
-
-
-# create the DataFrame
-col_names = ['issn',
- 'type',
- ]
-sherpa_issn = pd.DataFrame(columns = col_names)
-sherpa_issn
-
-
-# ## Table sherpa journals
-
-# In[5]:
-
-
-# create the DataFrame
-col_names = ['journal',
- 'title',
- 'url',
- ]
-sherpa_journal = pd.DataFrame(columns = col_names)
-sherpa_journal
-
-
-# ## Import the Journals and ISSN tables
-
-# In[6]:
-
-
-journal = pd.read_csv('sample/journals_publishers_brut.tsv', encoding='utf-8', header=0, sep='\t')
-journal
-
-
-# In[7]:
-
-
-issn = pd.read_csv('sample/issn_brut.tsv', encoding='utf-8', header=0, sep='\t')
-issn
-
-
-# In[8]:
-
-
-issn_ids = pd.read_csv('sample/issn_ids.tsv', encoding='utf-8', header=0, sep='\t')
-issn_ids
-
-
-# ## Sherpa/Romeo extraction
-
-# In[9]:
-
-
-# extract information from the Sherpa/Romeo data
-for index, row in issn.iterrows():
- journal_id = row['journal']
- journal_issn = row['issn']
- # if (((index/10) - int(index/10)) == 0) :
- # print(index)
- # initialise the variables to extract
- publisher_id = np.nan
- publisher_name = ''
- publisher_country = ''
- publisher_type = ''
- publisher_url = ''
- # loop over the JSON files
- # check that the file exists
- # print(row['issn'])
- if os.path.exists('sherpa/data/' + journal_issn + '.json'):
- with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
- data = json.load(f)
- if (len(data['items']) > 0):
- publisher_id = data['items'][0]['publishers'][0]['publisher']['id']
- if ('country' in data['items'][0]['publishers'][0]['publisher']):
- publisher_country = data['items'][0]['publishers'][0]['publisher']['country']
- if ('relationship_type' in data['items'][0]['publishers'][0]):
- publisher_type = data['items'][0]['publishers'][0]['relationship_type']
- if ('url' in data['items'][0]['publishers'][0]['publisher']):
- publisher_url = data['items'][0]['publishers'][0]['publisher']['url']
- if ('name' in data['items'][0]['publishers'][0]['publisher']['name'][0]):
- publisher_name = data['items'][0]['publishers'][0]['publisher']['name'][0]['name']
- sherpa_match = 'OK'
- publisher_sherpa = publisher_sherpa.append({'journal' : journal_id, 'publisher_id' : publisher_id,
- 'name' : publisher_name, 'country' : publisher_country,
- 'type' : publisher_type, 'url' : publisher_url}, ignore_index=True)
- else :
- print(row['issn'] + ' - found but empty')
- sherpa_match = 'empty'
- else :
- print(row['issn'] + ' - not found')
- sherpa_match = 'missing'
- sherpa_match_issn = sherpa_match_issn.append({'issn' : row['issn'], 'sherpa_match' : sherpa_match}, ignore_index=True)
-
-
-# In[10]:
-
-
-publisher_sherpa
-
-
-# In[11]:
-
-
-sherpa_match_issn
-
-
-# In[12]:
-
-
-# dedup
-publisher_sherpa_dedup = publisher_sherpa.drop_duplicates()
-publisher_sherpa_dedup
-
-
-# In[13]:
-
-
-sherpa_match_issn
-
-
-# In[14]:
-
-
-# add the ISSN-L and the title
-sherpa_match_issn = pd.merge(sherpa_match_issn, issn_ids, on='issn', how='left')
-sherpa_match_issn = pd.merge(sherpa_match_issn, journal[['issnl', 'title']], on='issnl', how='left')
-sherpa_match_issn
-
-
-# In[15]:
-
-
-sherpa_match_results = sherpa_match_issn[['id', 'issnl', 'sherpa_match']].groupby(['issnl', 'sherpa_match']).count()
-sherpa_match_results
-
-
-# In[16]:
-
-
-sherpa_match_results = sherpa_match_results.reset_index()
-sherpa_match_results
-
-
-# In[17]:
-
-
-sherpa_match_results_ok = sherpa_match_results.loc[sherpa_match_results['sherpa_match'] == 'OK']
-issn_ids_issnl = issn_ids[['issnl', 'journal']].drop_duplicates(subset='issnl')
-issn_ids_issnl = pd.merge(issn_ids_issnl, sherpa_match_results_ok, on='issnl', how='left')
-issn_ids_issnl = pd.merge(issn_ids_issnl, journal[['issnl', 'title']], on='issnl', how='left')
-issn_ids_issnl
-
-
-# In[18]:
-
-
-journals_not_sherpa = issn_ids_issnl.loc[issn_ids_issnl['sherpa_match'].isna()]
-journals_not_sherpa
-
-
-# In[19]:
-
-
-sherpa_match_results_empty = sherpa_match_results.loc[sherpa_match_results['sherpa_match'] == 'empty']
-sherpa_match_results_missing = sherpa_match_results.loc[sherpa_match_results['sherpa_match'] == 'missing']
-del journals_not_sherpa['sherpa_match']
-del journals_not_sherpa['id']
-journals_not_sherpa = pd.merge(journals_not_sherpa, sherpa_match_results_empty, on='issnl', how='left')
-del journals_not_sherpa['id']
-journals_not_sherpa = pd.merge(journals_not_sherpa, sherpa_match_results_missing, on='issnl', how='left')
-del journals_not_sherpa['id']
-journals_not_sherpa
-
-
-# In[20]:
-
-
-# extract journal information from the Sherpa/Romeo data
-for index, row in issn.iterrows():
- journal_id = row['journal']
- journal_issn = row['issn']
- # loop over the JSON files
- # check that the file exists
- # print(row['format'])
- if (((index/10) - int(index/10)) == 0) :
- print(index)
- if os.path.exists('sherpa/data/' + journal_issn + '.json'):
- with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
- data = json.load(f)
- title = np.nan
- url = np.nan
- if (len(data['items']) > 0):
- if ('url' in data['items'][0]):
- url = data['items'][0]['url']
- if ('title' in data['items'][0]['title'][0]):
- title = data['items'][0]['title'][0]['title']
- sherpa_journal = sherpa_journal.append({'journal' : journal_id, 'title' : title, 'url' : url}, ignore_index=True)
-
-
-# In[21]:
-
-
-sherpa_journal
-
-
-# In[22]:
-
-
-# extract the ISSNs and their types from the Sherpa/Romeo data
-for index, row in issn.iterrows():
- journal_id = row['journal']
- journal_issn = row['issn']
- # loop over the JSON files
- # check that the file exists
- # print(row['format'])
- if (((index/10) - int(index/10)) == 0) :
- print(index)
- if os.path.exists('sherpa/data/' + journal_issn + '.json'):
- with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
- myissn = np.nan
- mytype = np.nan
- data = json.load(f)
- if (len(data['items']) > 0):
- if ('issns' in data['items'][0]):
- issns = data['items'][0]['issns']
- for i in issns:
- if ('issn' in i):
- myissn = i['issn']
- if ('type' in i):
- mytype = i['type']
- sherpa_issn = sherpa_issn.append({'issn' : myissn, 'type' : mytype}, ignore_index=True)
-
-
-# In[23]:
-
-
-sherpa_issn
-
-
-# In[24]:
-
-
-# dedup
-sherpa_issn = sherpa_issn.drop_duplicates()
-sherpa_issn
-
-
-# In[25]:
-
-
-# complete the ISSN file with the Sherpa types
-issn2 = pd.merge(issn, sherpa_issn, on='issn', how='left')
-issn2
-
-
-# In[26]:
-
-
-# exports csv
-publisher_sherpa_dedup.to_csv('sample/publisher_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-sherpa_match_issn.to_csv('sample/sherpa_match_issn.tsv', sep='\t', encoding='utf-8', index=False)
-sherpa_journal.to_csv('sample/sherpa_journal.tsv', sep='\t', encoding='utf-8', index=False)
-issn2.to_csv('sample/issn_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-journals_not_sherpa.to_csv('sample/journals_not_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[27]:
-
-
-# exports excel
-publisher_sherpa_dedup.to_excel('sample/publisher_sherpa.xlsx', index=False)
-sherpa_match_issn.to_excel('sample/sherpa_match_issn.xlsx', index=False)
-sherpa_journal.to_excel('sample/sherpa_journal.xlsx', index=False)
-issn2.to_excel('sample/issn_sherpa.xlsx', index=False)
-journals_not_sherpa.to_excel('sample/journals_not_sherpa.xlsx', index=False)
-
-
-# In[28]:
-
-
-# add the Sherpa titles to the journals table
-# rename the columns
-sherpa_journal = sherpa_journal.rename(columns={'journal' : 'id'})
-journal = pd.merge(journal, sherpa_journal, on='id', how='left')
-journal
-
-
-# In[29]:
-
-
-# choose the title and URL
-journal['url'] = journal['url_y']
-journal.loc[journal['url_y'].isna(), 'url'] = journal['url_x']
-journal['title'] = journal['title_y']
-journal.loc[journal['title_y'].isna(), 'title'] = journal['title_x']
-journal
-
-
-# In[30]:
-
-
-journals_export = journal[['id', 'title', 'name_short_iso_4', 'starting_year', 'end_year', 'url', 'country', 'language', 'oa_status', 'publisher', 'doaj_seal', 'doaj_status', 'lockss', 'portico', 'nlch', 'qoam_av_score']]
-journals_export
-
-
-# In[31]:
-
-
-# rename the final fields
-journals_export = journals_export.rename(columns={'title' : 'name', 'url' : 'website'})
-# replace missing values and cast ids to int
-journals_export['starting_year'] = journals_export['starting_year'].fillna(0)
-journals_export['end_year'] = journals_export['end_year'].fillna(9999)
-journals_export['name_short_iso_4'] = journals_export['name_short_iso_4'].fillna('')
-journals_export['website'] = journals_export['website'].fillna('')
-journals_export['doaj_seal'] = journals_export['doaj_seal'].fillna('0')
-journals_export['country'] = journals_export['country'].fillna('999999')
-journals_export['language'] = journals_export['language'].fillna('999999')
-journals_export['doaj_status'] = journals_export['doaj_status'].astype(int)
-journals_export['doaj_seal'] = journals_export['doaj_seal'].astype(int)
-journals_export['lockss'] = journals_export['lockss'].astype(int)
-journals_export['portico'] = journals_export['portico'].astype(int)
-journals_export['nlch'] = journals_export['nlch'].astype(int)
-journals_export
-
-
-# In[32]:
-
-
-journals_export = journals_export.drop_duplicates(subset='id')
-journals_export
-
-
-# In[33]:
-
-
-# check for journals without a title
-journals_export.loc[journals_export['name'].isna()]
-
-
-# In[34]:
-
-
-# export and remove the journals without a title
-# export csv
-journals_export.loc[journals_export['name'].isna()].to_csv('sample/sherpa_journals_without_title.tsv', sep='\t', encoding='utf-8', index=False)
-# export excel
-journals_export.loc[journals_export['name'].isna()].to_excel('sample/sherpa_journals_without_title.xlsx', index=False)
-journals_export = journals_export.loc[journals_export['name'].notna()]
-journals_export
-
-
-# In[35]:
-
-
-journals_export.loc[journals_export['name'].str.contains('(Print)', regex=False)]
-
-
-# In[36]:
-
-
-journals_export.loc[journals_export['name'].str.contains('(Online)', regex=False)]
-
-
-# In[37]:
-
-
-# remove the " (Print)" and " (Online)" mentions from the titles (literal match, not a regex group)
-journals_export['name'] = journals_export['name'].str.replace(' (Print)', '', regex=False)
-journals_export['name'] = journals_export['name'].str.replace(' (Online)', '', regex=False)
-journals_export
-
-
-# In[38]:
-
-
-journals_export.loc[journals_export['name'].str.contains('(Print)', regex=False)]
-
-
-# In[39]:
-
-
-journals_export.loc[journals_export['name'].str.contains('(Online)', regex=False)]
-
-
-# ## Table sherpa_policies
-
-# In[40]:
-
-
-# create the DataFrame
-col_names = ['journal',
- 'issn',
- 'sherpa_id',
- 'sherpa_uri',
- 'open_access_prohibited',
- 'additional_oa_fee',
- 'article_version',
- 'license',
- 'embargo',
- 'prerequisites',
- 'prerequisite_funders',
- 'prerequisite_funders_name',
- 'prerequisite_funders_fundref',
- 'prerequisite_funders_ror',
- 'prerequisite_funders_country',
- 'prerequisite_funders_url',
- 'prerequisite_funders_sherpa_id',
- 'prerequisite_subjects',
- 'location',
- 'locations_ir',
- 'locations_not_ir',
- 'named_repository',
- 'named_academic_social_network',
- 'copyright_owner',
- 'publisher_deposit',
- 'archiving',
- 'conditions',
- 'public_notes'
- ]
-sherpa_policies = pd.DataFrame(columns = col_names)
-sherpa_policies
-
-
-# In[41]:
-
-
-# deduplicate by journal id
-issn_dedup = issn.drop_duplicates(subset='journal')
-issn_dedup
-
-
-# In[42]:
-
-
-# repository types that set archiving = 1:
-# all types: 'academic_social_network', 'any_repository', 'any_website', 'authors_homepage',
-# 'funder_designated_location', 'institutional_repository', 'institutional_website', 'named_academic_social_network',
-# 'named_repository', 'non_commercial_institutional_repository', 'non_commercial_repository',
-# 'non_commercial_social_network', 'non_commercial_subject_repository', 'non_commercial_website',
-# 'preprint_repository', 'subject_repository', 'this_journal'
-repositories_archiving = ['any_repository',
- 'institutional_repository',
- 'institutional_website',
- 'non_commercial_institutional_repository',
- 'non_commercial_repository',
- 'any_website',
- 'non_commercial_website']
-
-# extract the terms
-for index, row in issn_dedup.iterrows():
- journal_id = row['journal']
- journal_issn = row['issn']
- # loop over the JSON files
- # print(row['format'])
- if (((index/10) - int(index/10)) == 0) :
- print(index)
- # check that the file exists
- if os.path.exists('sherpa/data/' + journal_issn + '.json'):
- with open('sherpa/data/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
- data = json.load(f)
- # initialise the variables to extract
- sherpa_id = np.nan
- sherpa_uri = np.nan
- open_access_prohibited = np.nan
- location = np.nan
- locations_ir = ''
- locations_not_ir = ''
- additional_oa_fee = np.nan
- article_versions = np.nan
- article_version = np.nan
- licenses = []
- embargo = 0
- prerequisites = np.nan
- prerequisite_funders = np.nan
- prerequisite_funders_name = np.nan
- prerequisite_funders_fundref = np.nan
- prerequisite_funders_ror = np.nan
- prerequisite_funders_country = np.nan
- prerequisite_funders_url = np.nan
- prerequisite_funders_sherpa_id = np.nan
- prerequisite_subjects = np.nan
- named_repository = np.nan
- named_academic_social_network = np.nan
- copyright_owner = np.nan
- publisher_deposit = np.nan
- archiving = np.nan
- conditions = np.nan
- public_notes = np.nan
- if (len(data['items']) > 0):
- if ('id' in data['items'][0]):
- sherpa_id = data['items'][0]['id']
- # check whether the id is already present (`in` on a Series tests the index, so compare against .values)
- if sherpa_id in sherpa_policies['sherpa_id'].values :
- print('SKIP ' + str(sherpa_id))
- else :
- poilicies = data['items'][0]['publisher_policy']
- for poilicy in poilicies:
- # initialise the variables to extract
- sherpa_uri = np.nan
- open_access_prohibited = np.nan
- if ('uri' in poilicy):
- sherpa_uri = poilicy['uri']
- if ('open_access_prohibited' in poilicy):
- open_access_prohibited = poilicy['open_access_prohibited']
- if ('permitted_oa' in poilicy):
- poas = poilicy['permitted_oa']
- for poa in poas:
- additional_oa_fee = np.nan
- article_versions = np.nan
- article_version = np.nan
- licenses = []
- embargo = 0
- prerequisites = np.nan
- prerequisite_funders = np.nan
- prerequisite_funders_name = np.nan
- prerequisite_funders_fundref = np.nan
- prerequisite_funders_ror = np.nan
- prerequisite_funders_country = np.nan
- prerequisite_funders_url = np.nan
- prerequisite_funders_sherpa_id = np.nan
- prerequisite_subjects = np.nan
- named_repository = np.nan
- named_academic_social_network = np.nan
- locations_ir = ''
- locations_not_ir = ''
- copyright_owner = np.nan
- conditions = np.nan
- public_notes = np.nan
- if ('additional_oa_fee' in poa):
- additional_oa_fee = poa['additional_oa_fee']
- if ('location' in poa):
- archiving = 0
- location = ''
- mylocations = poa['location']['location']
- mylocations_text = poa['location']['location_phrases']
- if (type(mylocations) is not list):
- mylocations = [mylocations]
- location = ' ; '.join(mylocations)
- for locationi in mylocations:
- if locationi in repositories_archiving :
- archiving = archiving + 1
- for locationi_text in mylocations_text:
- if locationi_text['value'] == locationi :
- if locations_ir == '':
- locations_ir = locations_ir + locationi_text['phrase']
- else :
- if locationi_text['phrase'] not in locations_ir :
- locations_ir = locations_ir + ' ; ' + locationi_text['phrase']
- else :
- for locationi_text in mylocations_text:
- if locationi_text['value'] == locationi :
- if locations_not_ir == '':
- locations_not_ir = locations_not_ir + locationi_text['phrase']
- else :
- if locationi_text['phrase'] not in locations_not_ir :
- locations_not_ir = locations_not_ir + ' ; ' + locationi_text['phrase']
- # print (archiving)
- if archiving > 0:
- archiving = True
- else :
- archiving = False
- if ('named_repository' in poa['location']):
- if (type(poa['location']['named_repository']) is list):
- named_repository = ' ; '.join(poa['location']['named_repository'])
- else :
- named_repository = poa['location']['named_repository']
- locations_not_ir = locations_not_ir.replace('Named Repository', named_repository)
- locations_ir = locations_ir.replace('Named Repository', named_repository)
- if ('named_academic_social_network' in poa['location']):
- if (type(poa['location']['named_academic_social_network']) is list):
- named_academic_social_network = ' ; '.join(poa['location']['named_academic_social_network'])
- else :
- named_academic_social_network = poa['location']['named_academic_social_network']
- locations_not_ir = locations_not_ir.replace('Named Academic Social Network', named_academic_social_network)
- locations_ir = locations_ir.replace('Named Academic Social Network', named_academic_social_network)
- if ('embargo' in poa):
- # print(poa['embargo'])
- embargo_amount = 0
- if ('amount' in poa['embargo']):
- embargo_amount = poa['embargo']['amount']
- if ('units' in poa['embargo']):
- if (poa['embargo']['units'] == 'months') :
- embargo = embargo_amount
- elif (poa['embargo']['units'] == 'years') :
- embargo = embargo_amount*12
- elif (poa['embargo']['units'] == 'weeks') :
- embargo = int(embargo_amount/4)
- if (embargo == 0):
- embargo = 1
- elif (poa['embargo']['units'] == 'days') :
- embargo = int(embargo_amount/30)
- if (embargo == 0):
- embargo = 1
- else :
- embargo = embargo_amount
- if ('prerequisites' in poa):
- if 'prerequisites' in poa['prerequisites'] :
- if (type(poa['prerequisites']['prerequisites']) is list):
- prerequisites = ' ; '.join(poa['prerequisites']['prerequisites'])
- else:
- prerequisites = poa['prerequisites']['prerequisites']
- if ('prerequisite_funders' in poa['prerequisites']):
- prerequisite_funders = True
- # prerequisite_funders = poa['prerequisites']['prerequisite_funders']
- # if (type(poa['prerequisites']['prerequisite_funders']) is list):
- # prerequisite_funders = ' ; '.join(poa['prerequisites']['prerequisite_funders'])
- # else:
- # prerequisite_funders = poa['prerequisites']['prerequisite_funders']
- if ('prerequisite_subjects' in poa['prerequisites']):
- prerequisite_subjects = True
- # prerequisite_subjects = poa['prerequisites']['prerequisite_subjects']
- # if (type(poa['prerequisite_subjects']) is list):
- # prerequisite_subjects = ' ; '.join(poa['prerequisite_subjects'])
- # else:
- # prerequisite_subjects = poa['prerequisite_subjects']
- if ('copyright_owner' in poa):
- copyright_owner = poa['copyright_owner']
- if ('publisher_deposit' in poa):
- publisher_deposit = ''
- if (type(poa['publisher_deposit']) is list):
- for deposit in poa['publisher_deposit']:
- if 'type' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['type']
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + ' (' + deposit['repository_metadata']['name'][0]['name'] + ')'
- else :
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['name'][0]['name']
- publisher_deposit = publisher_deposit + ' ; '
- else :
- deposit = poa['publisher_deposit']
- if 'type' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['type']
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + ' (' + deposit['repository_metadata']['name'][0]['name'] + ')'
- else :
- if 'name' in deposit['repository_metadata']:
- publisher_deposit = publisher_deposit + deposit['repository_metadata']['name'][0]['name']
- publisher_deposit = publisher_deposit + ' ; '
- # print (publisher_deposit)
- if ('conditions' in poa):
- if (type(poa['conditions']) is list):
- conditions = ' ; '.join(poa['conditions'])
- else:
- conditions = poa['conditions']
- if ('public_notes' in poa):
- if (type(poa['public_notes']) is list):
- public_notes = ' ; '.join(poa['public_notes'])
- else:
- public_notes = poa['public_notes']
- if ('license' in poa):
- licenses = poa['license']
- if (type(licenses) is not list):
- licenses = [licenses]
- else :
- licenses = ['']
- # with article version
- if ('article_version' in poa):
- article_versions = poa['article_version']
- for article_version in article_versions:
- for license in licenses:
- if ('license' in license):
- mylicense = license['license']
- else :
- mylicense = ''
- # with prerequisites
- if ('prerequisites' in poa) :
- # with prerequisite_funders
- if ('prerequisite_funders' in poa['prerequisites']):
- for prerequisite_fundersi in poa['prerequisites']['prerequisite_funders'] :
- prerequisite_funders_name = prerequisite_fundersi['funder_metadata']['name'][0]['name']
- if 'acronym' in prerequisite_fundersi['funder_metadata']['name'][0]:
- prerequisite_funders_name = prerequisite_funders_name + ' (' + prerequisite_fundersi['funder_metadata']['name'][0]['acronym'] + ')'
- if 'identifiers' in prerequisite_fundersi['funder_metadata'] :
- for fund_identifier in prerequisite_fundersi['funder_metadata']['identifiers'] :
- if fund_identifier['type'] == 'fundref':
- prerequisite_funders_fundref = fund_identifier['identifier']
- if fund_identifier['type'] == 'ror':
- prerequisite_funders_ror = fund_identifier['identifier']
- if 'country' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_country = prerequisite_fundersi['funder_metadata']['country']
- if 'url' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_url = prerequisite_fundersi['funder_metadata']['url'][0]['url']
- prerequisite_funders_sherpa_id = prerequisite_fundersi['funder_metadata']['id']
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisite_funders
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisites
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
-
- # without article version
- else :
- if (type(licenses) is not list):
- licenses = [licenses]
- for license in licenses:
- if ('license' in license):
- mylicense = license['license']
- else :
- mylicense = ''
- # with prerequisites
- if ('prerequisites' in poa) :
- # with prerequisite_funders
- if ('prerequisite_funders' in poa['prerequisites']):
- for prerequisite_fundersi in poa['prerequisites']['prerequisite_funders'] :
- prerequisite_funders_name = prerequisite_fundersi['funder_metadata']['name'][0]['name']
- if 'acronym' in prerequisite_fundersi['funder_metadata']['name'][0]:
- prerequisite_funders_name = prerequisite_funders_name + ' (' + prerequisite_fundersi['funder_metadata']['name'][0]['acronym'] + ')'
- if 'identifiers' in prerequisite_fundersi['funder_metadata'] :
- for fund_identifier in prerequisite_fundersi['funder_metadata']['identifiers'] :
- if fund_identifier['type'] == 'fundref':
- prerequisite_funders_fundref = fund_identifier['identifier']
- if fund_identifier['type'] == 'ror':
- prerequisite_funders_ror = fund_identifier['identifier']
- if 'country' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_country = prerequisite_fundersi['funder_metadata']['country']
- if 'url' in prerequisite_fundersi['funder_metadata']:
- prerequisite_funders_url = prerequisite_fundersi['funder_metadata']['url'][0]['url']
- prerequisite_funders_sherpa_id = prerequisite_fundersi['funder_metadata']['id']
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisite_funders
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without prerequisites
- else :
- sherpa_policies = sherpa_policies.append({'journal' : journal_id,
- 'issn' : journal_issn,
- 'sherpa_id' : sherpa_id,
- 'sherpa_uri' : sherpa_uri,
- 'open_access_prohibited' : open_access_prohibited,
- 'additional_oa_fee' : additional_oa_fee,
- 'article_version' : article_version,
- 'license' : mylicense,
- 'embargo' : embargo,
- 'prerequisites' : prerequisites,
- 'prerequisite_funders' : prerequisite_funders,
- 'prerequisite_funders_name' : prerequisite_funders_name,
- 'prerequisite_funders_fundref' : prerequisite_funders_fundref,
- 'prerequisite_funders_ror' : prerequisite_funders_ror,
- 'prerequisite_funders_country' : prerequisite_funders_country,
- 'prerequisite_funders_url' : prerequisite_funders_url,
- 'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
- 'prerequisite_subjects' : prerequisite_subjects,
- 'location' : location,
- 'locations_ir' : locations_ir,
- 'locations_not_ir' : locations_not_ir,
- 'named_repository' : named_repository,
- 'named_academic_social_network' : named_academic_social_network,
- 'copyright_owner' : copyright_owner,
- 'publisher_deposit' : publisher_deposit,
- 'archiving' : archiving,
- 'conditions' : conditions,
- 'public_notes' : public_notes
- }, ignore_index=True)
- # without permitted_oa
- else :
- print ('permitted_oa MISSING')
- else :
- print ('id MISSING')
-
-
-# In[43]:
-
-
-sherpa_policies
-
-
-# In[44]:
-
-
-# turn the index into a column
-sherpa_policies = sherpa_policies.reset_index()
-# add the id as index + 1
-sherpa_policies['id'] = sherpa_policies['index'] + 1
-del sherpa_policies['index']
-sherpa_policies
-
-
-# In[45]:
-
-
-# export csv
-sherpa_policies.to_csv('sample/sherpa_policies_brut.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[46]:
-
-
-# export excel
-sherpa_policies.to_excel('sample/sherpa_policies_brut.xlsx', index=False)
-
-
-# ## Computing the "green" category and final export of the journals
-
-# In[47]:
-
-
-sherpa_policies
-
-
-# In[48]:
-
-
-sherpa_policies_ir = sherpa_policies.loc[(sherpa_policies['archiving'] == True) & (sherpa_policies['article_version'] == 'published') & (sherpa_policies['prerequisite_funders'].isna())][['journal', 'embargo', 'license', 'conditions']]
-sherpa_policies_ir
-
-
-# In[49]:
-
-
-# deduplicate: after sorting, keep the row with the shortest embargo for each journal
-sherpa_policies_ir_id = sherpa_policies_ir[['journal', 'embargo']].sort_values(by=['journal', 'embargo'])
-sherpa_policies_ir_dedup = sherpa_policies_ir_id.drop_duplicates(subset='journal').copy()
-sherpa_policies_ir_dedup
-
-
-# In[50]:
-
-
-# add the green category (2)
-sherpa_policies_ir_dedup['oa_status'] = 2
-sherpa_policies_ir_dedup
-
-
-# In[51]:
-
-
-# merge with the journals
-sherpa_policies_ir_dedup = sherpa_policies_ir_dedup.rename(columns={'journal' : 'id'})
-journals_export = pd.merge(journals_export, sherpa_policies_ir_dedup, on='id', how='left')
-journals_export
-
-
-# In[52]:
-
-
-# choose the OA category
-journals_export['oa_status'] = journals_export['oa_status_x']
-journals_export.loc[(journals_export['oa_status_x'] == 1) & (journals_export['oa_status_y'].notna()), 'oa_status'] = journals_export['oa_status_y']
-journals_export
-
-
-# In[53]:
-
-
-# 6 : Diamond
-# 5 : Gold
-# 4 : Full
-# 3 : Hybrid
-# 2 : Green
-# 1 : UNKNOWN
-journals_export['oa_status'].value_counts()
-
-
-# In[54]:
-
-
-del journals_export['embargo']
-del journals_export['oa_status_x']
-del journals_export['oa_status_y']
-journals_export
-
-
-# In[55]:
-
-
-journals_export['oa_status'] = journals_export['oa_status'].astype(int)
-journals_export
-
-
-# In[56]:
-
-
-# export csv
-journals_export.to_csv('sample/journal_fin_sherpa.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[57]:
-
-
-# export excel
-journals_export.to_excel('sample/journal_fin_sherpa.xlsx', index=False)
-
-
-# In[58]:
-
-
-# export csv
-sherpa_policies_ir_dedup.to_csv('sample/journal_ir.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[59]:
-
-
-# export excel
-sherpa_policies_ir_dedup.to_excel('sample/journal_ir.xlsx', index=False)
-
-
-# In[ ]:
-
-
-
-
diff --git a/import_scripts/07_oacct_sherpa_publishers.md b/import_scripts/07_oacct_sherpa_publishers.md
deleted file mode 100644
index 2a7ef957..00000000
--- a/import_scripts/07_oacct_sherpa_publishers.md
+++ /dev/null
@@ -1,4401 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) Project
-
-P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts selected data from the sources obtained via API and processes it for use in the OACCT application.
-
-Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-## Journals Publishers table: adding Sherpa information
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-```
-
-
-```python
-publishers_issn = pd.read_csv('sample/publishers_brut.tsv', encoding='utf-8', header=0, sep='\t')
-publishers_issn
-```
-
-
-
-
-
-
-
-
-
-
-| | publisher_id | name | id |
-|---|---|---|---|
-| 0 | Revue_Médicale_Suisse | Revue Médicale Suisse | 1 |
-| 1 | American_Physical_Society | American Physical Society | 2 |
-| 2 | Public_Library_of_Science | Public Library of Science | 3 |
-| 3 | The_Global_Studies_Institute_de_l’Université_d... | The Global Studies Institute de l’Université d... | 4 |
-| 4 | Universitat_de_València,_Departamento_de_Teorí... | Universitat de València, Departamento de Teorí... | 5 |
-| ... | ... | ... | ... |
-| 376 | Tipografia_La_Commerciale | Tipografia La Commerciale | 377 |
-| 377 | Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic... | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 378 |
-| 378 | Excerpta_Medica | Excerpta Medica | 379 |
-| 379 | Generative_Grammar_Group_of_the_Department_of_... | Generative Grammar Group of the Department of ... | 380 |
-| 380 | 999999 | UNKNOWN | 999999 |
-
-
-
381 rows × 3 columns
-
-
-
-
-
-```python
-# import ids
-publisher_ids = pd.read_csv('sample/journals_publishers_ids.tsv', encoding='utf-8', header=0, sep='\t')
-publisher_ids
-```
-
-
-
-
-
-
-
-
-
-
-| | id | publisher |
-|---|---|---|
-| 0 | 1 | 1 |
-| 1 | 2 | 2 |
-| 2 | 3 | 3 |
-| 3 | 4 | 4 |
-| 4 | 4 | 5 |
-| ... | ... | ... |
-| 940 | 997 | 47 |
-| 941 | 998 | 75 |
-| 942 | 999 | 8 |
-| 943 | 1000 | 119 |
-| 944 | 1001 | 217 |
-
-
-
-
945 rows × 2 columns
-
-
-
-
-
-```python
-# rename the id columns
-publisher_ids = publisher_ids.rename(columns = {'id': 'journal'})
-publisher_ids = publisher_ids.rename(columns = {'publisher': 'id'})
-```
-
-
-```python
-# deduplicate by publisher id
-publisher_ids_dedup = publisher_ids.drop_duplicates(subset='id')
-publisher_ids_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | id |
-|---|---|---|
-| 0 | 1 | 1 |
-| 1 | 2 | 2 |
-| 2 | 3 | 3 |
-| 3 | 4 | 4 |
-| 4 | 4 | 5 |
-| ... | ... | ... |
-| 929 | 987 | 376 |
-| 930 | 987 | 377 |
-| 932 | 989 | 378 |
-| 934 | 991 | 379 |
-| 937 | 994 | 380 |
-
-
-
-
380 rows × 2 columns
-
-
-
-
-
-```python
-# merge with journals
-publisher = pd.merge(publishers_issn, publisher_ids_dedup, on='id', how='left')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
-| | publisher_id | name | id | journal |
-|---|---|---|---|---|
-| 0 | Revue_Médicale_Suisse | Revue Médicale Suisse | 1 | 1.0 |
-| 1 | American_Physical_Society | American Physical Society | 2 | 2.0 |
-| 2 | Public_Library_of_Science | Public Library of Science | 3 | 3.0 |
-| 3 | The_Global_Studies_Institute_de_l’Université_d... | The Global Studies Institute de l’Université d... | 4 | 4.0 |
-| 4 | Universitat_de_València,_Departamento_de_Teorí... | Universitat de València, Departamento de Teorí... | 5 | 4.0 |
-| ... | ... | ... | ... | ... |
-| 376 | Tipografia_La_Commerciale | Tipografia La Commerciale | 377 | 987.0 |
-| 377 | Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic... | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 378 | 989.0 |
-| 378 | Excerpta_Medica | Excerpta Medica | 379 | 991.0 |
-| 379 | Generative_Grammar_Group_of_the_Department_of_... | Generative Grammar Group of the Department of ... | 380 | 994.0 |
-| 380 | 999999 | UNKNOWN | 999999 | NaN |
-
-
-
-
381 rows × 4 columns
-
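The `journal` column in the output above is displayed as `1.0`, `2.0`, ... because the left merge leaves the `UNKNOWN` publisher without a match, and the resulting `NaN` forces the integer column to float. The following is a minimal sketch of that behaviour on toy data (the frames and their values are illustrative assumptions, not the project's files); it also shows two optional `pd.merge` arguments, `validate` and `indicator`, that can help check the join, and the nullable `Int64` dtype as one possible way to keep whole numbers.

```python
import pandas as pd

# Toy stand-ins for publishers_issn and publisher_ids_dedup (hypothetical values).
publishers = pd.DataFrame({'id': [1, 2, 999999], 'name': ['A', 'B', 'UNKNOWN']})
journal_ids = pd.DataFrame({'journal': [10, 20], 'id': [1, 2]})

# validate raises if 'id' is not unique on both sides;
# indicator adds a '_merge' column telling where each row came from.
merged = pd.merge(publishers, journal_ids, on='id', how='left',
                  validate='one_to_one', indicator=True)

# The unmatched row gets NaN, so the integer column is now float64.
dtype_after_merge = str(merged['journal'].dtype)

# The nullable integer dtype keeps whole numbers and marks the gap as <NA>.
merged['journal'] = merged['journal'].astype('Int64')

# Publisher ids that found no journal.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'id'].tolist()
```

Whether the conversion to `Int64` is desirable depends on how the exported TSV is consumed downstream; the notebook itself keeps the float column.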
-
-
-
-
-```python
-# add the Sherpa values
-publisher_sherpa = pd.read_csv('sample/publisher_sherpa.tsv', encoding='utf-8', header=0, sep='\t')
-publisher_sherpa
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | publisher_id | name | country | type | url |
-|---|---|---|---|---|---|---|
-| 0 | 532 | 45 | John Wiley and Sons | gb | former_publisher | http://www.wiley.com/ |
-| 1 | 498 | 4 | American Chemical Society | us | society_publisher | http://pubs.acs.org/ |
-| 2 | 789 | 126 | Acoustical Society of America | us | society_publisher | http://acousticalsociety.org/ |
-| 3 | 166 | 3291 | Springer | gb | commercial_publisher | https://www.springernature.com/gp/products/jou... |
-| 4 | 807 | 3291 | Springer | gb | commercial_publisher | https://www.springernature.com/gp/products/jou... |
-| ... | ... | ... | ... | ... | ... | ... |
-| 803 | 870 | 10 | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 804 | 41 | 10 | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 805 | 80 | 10 | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 806 | 533 | 10 | American Physical Society | us | society_publisher | http://www.aps.org/ |
-| 807 | 608 | 10 | American Physical Society | us | society_publisher | http://www.aps.org/ |
-
-
-
-
808 rows × 6 columns
-
-
-
-
-
-```python
-# rename the Sherpa columns
-publisher_sherpa = publisher_sherpa.rename(columns = {'publisher_id': 'publisher_id_sherpa', 'url': 'website_sherpa', 'country': 'iso_code'})
-```
-
-
-```python
-# merge on the journal ids
-publisher = pd.merge(publisher, publisher_sherpa, on='journal', how='left')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- publisher_id
- name_x
- id
- journal
- publisher_id_sherpa
- name_y
- iso_code
- type
- website_sherpa
-
-
-
-
- 0
- Revue_Médicale_Suisse
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1
- American_Physical_Society
- American Physical Society
- 2
- 2.0
- 10.0
- American Physical Society
- us
- society_publisher
- http://www.aps.org/
-
-
- 2
- Public_Library_of_Science
- Public Library of Science
- 3
- 3.0
- 112.0
- Public Library of Science
- us
- commercial_publisher
- http://www.plos.org/
-
-
- 3
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia_La_Commerciale
- Tipografia La Commerciale
- 377
- 987.0
- 3291.0
- Springer
- gb
- commercial_publisher
- https://www.springernature.com/gp/products/jou...
-
-
- 377
- Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 378
- Excerpta_Medica
- Excerpta Medica
- 379
- 991.0
- 30.0
- Elsevier
- us
- commercial_publisher
- http://www.elsevier.com/
-
-
- 379
- Generative_Grammar_Group_of_the_Department_of_...
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 380
- 999999
- UNKNOWN
- 999999
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
-
-
381 rows × 9 columns
-
-
-
-
-
-```python
-# rename the name columns
-publisher = publisher.rename(columns = {'name_x': 'name_issn', 'name_y': 'name_sherpa'})
-```
-
-
-```python
-# add information from the journals
-publisher_journals = pd.read_csv('sample/journals_publishers_brut.tsv', encoding='utf-8', header=0, sep='\t', usecols=['id', 'url'])
-publisher_journals
-```
-
-
-
-
-
-
-
-
-
-
-| | id | url |
-|---|---|---|
-| 0 | 1 | NaN |
-| 1 | 2 | http://prl.aps.org/ |
-| 2 | 3 | http://www.plosone.org/ |
-| 3 | 4 | NaN |
-| 4 | 5 | http://ojps.aip.org/prbo/ |
-| ... | ... | ... |
-| 906 | 997 | NaN |
-| 907 | 998 | http://www.jpedsurg.org |
-| 908 | 999 | http://www.springerlink.com/content/100451 |
-| 909 | 1000 | NaN |
-| 910 | 1001 | https://www.physiology.org/journal/jappl |
-
-
-
-
911 rows × 2 columns
-
-
-
-
-
-```python
-# rename the id column
-publisher_journals = publisher_journals.rename(columns = {'id': 'journal'})
-```
-
-
-```python
-# merge on the journal ids
-publisher = pd.merge(publisher, publisher_journals, on='journal', how='left')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- publisher_id
- name_issn
- id
- journal
- publisher_id_sherpa
- name_sherpa
- iso_code
- type
- website_sherpa
- url
-
-
-
-
- 0
- Revue_Médicale_Suisse
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 1
- American_Physical_Society
- American Physical Society
- 2
- 2.0
- 10.0
- American Physical Society
- us
- society_publisher
- http://www.aps.org/
- http://prl.aps.org/
-
-
- 2
- Public_Library_of_Science
- Public Library of Science
- 3
- 3.0
- 112.0
- Public Library of Science
- us
- commercial_publisher
- http://www.plos.org/
- http://www.plosone.org/
-
-
- 3
- The_Global_Studies_Institute_de_l’Université_d...
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 4
- Universitat_de_València,_Departamento_de_Teorí...
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia_La_Commerciale
- Tipografia La Commerciale
- 377
- 987.0
- 3291.0
- Springer
- gb
- commercial_publisher
- https://www.springernature.com/gp/products/jou...
- NaN
-
-
- 377
- Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 378
- Excerpta_Medica
- Excerpta Medica
- 379
- 991.0
- 30.0
- Elsevier
- us
- commercial_publisher
- http://www.elsevier.com/
- NaN
-
-
- 379
- Generative_Grammar_Group_of_the_Department_of_...
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
- 380
- 999999
- UNKNOWN
- 999999
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
-
-
381 rows × 10 columns
-
-
-
-
-
-```python
-# drop the unused columns and rename url
-del publisher['publisher_id']
-del publisher['publisher_id_sherpa']
-del publisher['type']
-publisher = publisher.rename(columns = {'url' : 'website_issn_journal'})
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- name_issn
- id
- journal
- name_sherpa
- iso_code
- website_sherpa
- website_issn_journal
-
-
-
-
- 0
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- NaN
- NaN
- NaN
-
-
- 1
- American Physical Society
- 2
- 2.0
- American Physical Society
- us
- http://www.aps.org/
- http://prl.aps.org/
-
-
- 2
- Public Library of Science
- 3
- 3.0
- Public Library of Science
- us
- http://www.plos.org/
- http://www.plosone.org/
-
-
- 3
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- NaN
- NaN
- NaN
-
-
- 4
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia La Commerciale
- 377
- 987.0
- Springer
- gb
- https://www.springernature.com/gp/products/jou...
- NaN
-
-
- 377
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- NaN
- NaN
- NaN
-
-
- 378
- Excerpta Medica
- 379
- 991.0
- Elsevier
- us
- http://www.elsevier.com/
- NaN
-
-
- 379
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- NaN
- NaN
- NaN
-
-
- 380
- UNKNOWN
- 999999
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
-
-
381 rows × 7 columns
-
-
-
-
-
-```python
-# add the empty string fields and the integer field
-publisher['city'] = ''
-publisher['state'] = ''
-publisher['oa_policies'] = ''
-publisher['starting_year'] = 0
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- name_issn
- id
- journal
- name_sherpa
- iso_code
- website_sherpa
- website_issn_journal
- city
- state
- oa_policies
- starting_year
-
-
-
-
- 0
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- NaN
- NaN
- NaN
-
-
-
- 0
-
-
- 1
- American Physical Society
- 2
- 2.0
- American Physical Society
- us
- http://www.aps.org/
- http://prl.aps.org/
-
-
-
- 0
-
-
- 2
- Public Library of Science
- 3
- 3.0
- Public Library of Science
- us
- http://www.plos.org/
- http://www.plosone.org/
-
-
-
- 0
-
-
- 3
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- NaN
- NaN
- NaN
-
-
-
- 0
-
-
- 4
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- NaN
- NaN
- NaN
-
-
-
- 0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia La Commerciale
- 377
- 987.0
- Springer
- gb
- https://www.springernature.com/gp/products/jou...
- NaN
-
-
-
- 0
-
-
- 377
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- NaN
- NaN
- NaN
-
-
-
- 0
-
-
- 378
- Excerpta Medica
- 379
- 991.0
- Elsevier
- us
- http://www.elsevier.com/
- NaN
-
-
-
- 0
-
-
- 379
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- NaN
- NaN
- NaN
-
-
-
- 0
-
-
- 380
- UNKNOWN
- 999999
- NaN
- NaN
- NaN
- NaN
- NaN
-
-
-
- 0
-
-
-
-
381 rows × 11 columns
-
-
-
-
-
-```python
-# iso_code in upper case
-publisher['iso_code'] = publisher['iso_code'].str.upper()
-# add the placeholder value for unknown countries
-publisher['iso_code'] = publisher['iso_code'].fillna('__')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- name_issn
- id
- journal
- name_sherpa
- iso_code
- website_sherpa
- website_issn_journal
- city
- state
- oa_policies
- starting_year
-
-
-
-
- 0
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
-
-
- 1
- American Physical Society
- 2
- 2.0
- American Physical Society
- US
- http://www.aps.org/
- http://prl.aps.org/
-
-
-
- 0
-
-
- 2
- Public Library of Science
- 3
- 3.0
- Public Library of Science
- US
- http://www.plos.org/
- http://www.plosone.org/
-
-
-
- 0
-
-
- 3
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
-
-
- 4
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia La Commerciale
- 377
- 987.0
- Springer
- GB
- https://www.springernature.com/gp/products/jou...
- NaN
-
-
-
- 0
-
-
- 377
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
-
-
- 378
- Excerpta Medica
- 379
- 991.0
- Elsevier
- US
- http://www.elsevier.com/
- NaN
-
-
-
- 0
-
-
- 379
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
-
-
- 380
- UNKNOWN
- 999999
- NaN
- NaN
- __
- NaN
- NaN
-
-
-
- 0
-
-
-
-
381 rows × 11 columns
-
-
-
-
-
-```python
-# merge with countries
-country = pd.read_csv('sample/country.tsv', usecols=('iso_code', 'id'), encoding='utf-8', header=0, sep='\t')
-country
-```
-
-
-
-
-
-
-
-
-
-
-| | iso_code | id |
-|---|---|---|
-| 0 | AF | 1 |
-| 1 | AL | 2 |
-| 2 | DZ | 3 |
-| 3 | AS | 4 |
-| 4 | AD | 5 |
-| ... | ... | ... |
-| 246 | ZM | 247 |
-| 247 | ZW | 248 |
-| 248 | AX | 249 |
-| 249 | OI | 250 |
-| 250 | __ | 999999 |
-
-
-
-
251 rows × 2 columns
-
-
-
-
-
-```python
-country = country.rename(columns={'id': 'country'})
-country
-```
-
-
-
-
-
-
-
-
-
-
-| | iso_code | country |
-|---|---|---|
-| 0 | AF | 1 |
-| 1 | AL | 2 |
-| 2 | DZ | 3 |
-| 3 | AS | 4 |
-| 4 | AD | 5 |
-| ... | ... | ... |
-| 246 | ZM | 247 |
-| 247 | ZW | 248 |
-| 248 | AX | 249 |
-| 249 | OI | 250 |
-| 250 | __ | 999999 |
-
-
-
-
251 rows × 2 columns
-
-
-
-
-
-```python
-publisher = pd.merge(publisher, country, on='iso_code', how='left')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- name_issn
- id
- journal
- name_sherpa
- iso_code
- website_sherpa
- website_issn_journal
- city
- state
- oa_policies
- starting_year
- country
-
-
-
-
- 0
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
-
-
- 1
- American Physical Society
- 2
- 2.0
- American Physical Society
- US
- http://www.aps.org/
- http://prl.aps.org/
-
-
-
- 0
- 236
-
-
- 2
- Public Library of Science
- 3
- 3.0
- Public Library of Science
- US
- http://www.plos.org/
- http://www.plosone.org/
-
-
-
- 0
- 236
-
-
- 3
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
-
-
- 4
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia La Commerciale
- 377
- 987.0
- Springer
- GB
- https://www.springernature.com/gp/products/jou...
- NaN
-
-
-
- 0
- 234
-
-
- 377
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
-
-
- 378
- Excerpta Medica
- 379
- 991.0
- Elsevier
- US
- http://www.elsevier.com/
- NaN
-
-
-
- 0
- 236
-
-
- 379
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
-
-
- 380
- UNKNOWN
- 999999
- NaN
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
-
-
-
-
381 rows × 12 columns
-
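The two preceding steps (upper-casing `iso_code`, filling gaps with the `__` placeholder, then joining on the country table) work together so that every publisher ends up with a country id, with `999999` standing for "unknown". A minimal sketch on toy data follows; the two-row frames are assumptions made for illustration, and only the `US` → `236` and `__` → `999999` pairs are taken from the outputs above.

```python
import pandas as pd

# Toy stand-ins for the publisher and country tables (hypothetical rows).
publisher = pd.DataFrame({'name': ['American Physical Society', 'Revue Médicale Suisse'],
                          'iso_code': ['us', None]})
country = pd.DataFrame({'iso_code': ['US', '__'], 'country': [236, 999999]})

# Normalise the codes first, then replace missing ones with the placeholder.
publisher['iso_code'] = publisher['iso_code'].str.upper().fillna('__')

# many_to_one: several publishers may share a country, but each code is unique.
publisher = pd.merge(publisher, country, on='iso_code', how='left',
                     validate='many_to_one')

# Any remaining NaN would point to a code missing from the country table.
codes_without_country = int(publisher['country'].isna().sum())
```

The `validate` argument is an optional safeguard that the notebook does not use; it raises an error if the country table ever contains a duplicated `iso_code`.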
-
-
-
-
-```python
-# prefer the Sherpa values, fall back to issn.org
-publisher.loc[publisher['name_sherpa'].notna(), 'name'] = publisher['name_sherpa']
-publisher.loc[publisher['name_sherpa'].isna(), 'name'] = publisher['name_issn']
-publisher.loc[publisher['website_sherpa'].notna(), 'website'] = publisher['website_sherpa']
-publisher.loc[publisher['website_sherpa'].isna(), 'website'] = publisher['website_issn_journal']
-publisher
-```
-
-
-
-
-
-
-
-
-
-
- name_issn
- id
- journal
- name_sherpa
- iso_code
- website_sherpa
- website_issn_journal
- city
- state
- oa_policies
- starting_year
- country
- name
- website
-
-
-
-
- 0
- Revue Médicale Suisse
- 1
- 1.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
- Revue Médicale Suisse
- NaN
-
-
- 1
- American Physical Society
- 2
- 2.0
- American Physical Society
- US
- http://www.aps.org/
- http://prl.aps.org/
-
-
-
- 0
- 236
- American Physical Society
- http://www.aps.org/
-
-
- 2
- Public Library of Science
- 3
- 3.0
- Public Library of Science
- US
- http://www.plos.org/
- http://www.plosone.org/
-
-
-
- 0
- 236
- Public Library of Science
- http://www.plos.org/
-
-
- 3
- The Global Studies Institute de l’Université d...
- 4
- 4.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
- The Global Studies Institute de l’Université d...
- NaN
-
-
- 4
- Universitat de València, Departamento de Teorí...
- 5
- 4.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
- Universitat de València, Departamento de Teorí...
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 376
- Tipografia La Commerciale
- 377
- 987.0
- Springer
- GB
- https://www.springernature.com/gp/products/jou...
- NaN
-
-
-
- 0
- 234
- Springer
- https://www.springernature.com/gp/products/jou...
-
-
- 377
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- 378
- 989.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
- Red.: Prof. Dr. F. Cavalli, Istituto oncologic...
- NaN
-
-
- 378
- Excerpta Medica
- 379
- 991.0
- Elsevier
- US
- http://www.elsevier.com/
- NaN
-
-
-
- 0
- 236
- Elsevier
- http://www.elsevier.com/
-
-
- 379
- Generative Grammar Group of the Department of ...
- 380
- 994.0
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
- Generative Grammar Group of the Department of ...
- NaN
-
-
- 380
- UNKNOWN
- 999999
- NaN
- NaN
- __
- NaN
- NaN
-
-
-
- 0
- 999999
- UNKNOWN
- NaN
-
-
-
-
381 rows × 14 columns
-
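The four `.loc` assignments above implement a "first non-missing value wins" rule: the Sherpa name and website are used when present, otherwise the issn.org values. As a hedged alternative sketch, the same rule can be written with `Series.combine_first`; the three-row frame and the `example` URLs below are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy data: Sherpa values present, only issn.org values present, nothing present.
publisher = pd.DataFrame({
    'name_sherpa': ['Springer', np.nan, np.nan],
    'name_issn': ['Tipografia La Commerciale', 'Revue Médicale Suisse', 'UNKNOWN'],
    'website_sherpa': ['http://sherpa.example/', np.nan, np.nan],
    'website_issn_journal': [np.nan, 'http://journal.example/', np.nan],
})

# combine_first keeps the caller's value and fills its gaps from the argument.
publisher['name'] = publisher['name_sherpa'].combine_first(publisher['name_issn'])
publisher['website'] = publisher['website_sherpa'].combine_first(
    publisher['website_issn_journal'])
```

Rows where both sources are missing stay `NaN`, which matches the notebook's later `fillna('')` on `website`.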
-
-
-
-
-```python
-# keep the fields needed for the publisher table
-publisher_export = publisher[['id', 'name', 'country', 'city', 'state', 'starting_year', 'website', 'oa_policies']]
-```
-
-
-```python
-# drop duplicate publisher ids
-publisher_export = publisher_export.drop_duplicates(subset='id')
-publisher_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | country | city | state | starting_year | website | oa_policies |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue Médicale Suisse | 999999 | | | 0 | NaN | |
-| 1 | 2 | American Physical Society | 236 | | | 0 | http://www.aps.org/ | |
-| 2 | 3 | Public Library of Science | 236 | | | 0 | http://www.plos.org/ | |
-| 3 | 4 | The Global Studies Institute de l’Université d... | 999999 | | | 0 | NaN | |
-| 4 | 5 | Universitat de València, Departamento de Teorí... | 999999 | | | 0 | NaN | |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 376 | 377 | Springer | 234 | | | 0 | https://www.springernature.com/gp/products/jou... | |
-| 377 | 378 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 999999 | | | 0 | NaN | |
-| 378 | 379 | Elsevier | 236 | | | 0 | http://www.elsevier.com/ | |
-| 379 | 380 | Generative Grammar Group of the Department of ... | 999999 | | | 0 | NaN | |
-| 380 | 999999 | UNKNOWN | 999999 | | | 0 | NaN | |
-
-381 rows × 8 columns
-
-
-
-
-
-```python
-# replace missing websites with empty strings
-publisher_export['website'] = publisher_export['website'].fillna('')
-publisher_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | country | city | state | starting_year | website | oa_policies |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue Médicale Suisse | 999999 | | | 0 | | |
-| 1 | 2 | American Physical Society | 236 | | | 0 | http://www.aps.org/ | |
-| 2 | 3 | Public Library of Science | 236 | | | 0 | http://www.plos.org/ | |
-| 3 | 4 | The Global Studies Institute de l’Université d... | 999999 | | | 0 | | |
-| 4 | 5 | Universitat de València, Departamento de Teorí... | 999999 | | | 0 | | |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 376 | 377 | Springer | 234 | | | 0 | https://www.springernature.com/gp/products/jou... | |
-| 377 | 378 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 999999 | | | 0 | | |
-| 378 | 379 | Elsevier | 236 | | | 0 | http://www.elsevier.com/ | |
-| 379 | 380 | Generative Grammar Group of the Department of ... | 999999 | | | 0 | | |
-| 380 | 999999 | UNKNOWN | 999999 | | | 0 | | |
-
-381 rows × 8 columns
-
-
-
-
-
-```python
-# merge to get the publisher names
-publisher_ids_dedup = pd.merge(publisher_ids_dedup, publisher_export[['id', 'name']], on='id', how='left')
-publisher_ids_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | id | name |
-|---|---|---|---|
-| 0 | 1 | 1 | Revue Médicale Suisse |
-| 1 | 2 | 2 | American Physical Society |
-| 2 | 3 | 3 | Public Library of Science |
-| 3 | 4 | 4 | The Global Studies Institute de l’Université d... |
-| 4 | 4 | 5 | Universitat de València, Departamento de Teorí... |
-| ... | ... | ... | ... |
-| 375 | 987 | 376 | Springer |
-| 376 | 987 | 377 | Springer |
-| 377 | 989 | 378 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... |
-| 378 | 991 | 379 | Elsevier |
-| 379 | 994 | 380 | Generative Grammar Group of the Department of ... |
-
-380 rows × 3 columns
-
-
-
-
-
-```python
-# keep the pre-deduplication ids so that publisher_ids_dedup can be corrected later
-publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'id': 'publisher_av_dedup'})
-publisher_ids_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | publisher_av_dedup | name |
-|---|---|---|---|
-| 0 | 1 | 1 | Revue Médicale Suisse |
-| 1 | 2 | 2 | American Physical Society |
-| 2 | 3 | 3 | Public Library of Science |
-| 3 | 4 | 4 | The Global Studies Institute de l’Université d... |
-| 4 | 4 | 5 | Universitat de València, Departamento de Teorí... |
-| ... | ... | ... | ... |
-| 375 | 987 | 376 | Springer |
-| 376 | 987 | 377 | Springer |
-| 377 | 989 | 378 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... |
-| 378 | 991 | 379 | Elsevier |
-| 379 | 994 | 380 | Generative Grammar Group of the Department of ... |
-
-380 rows × 3 columns
-
-
-
-
-
-```python
-publisher_export_dedup = publisher_export.drop_duplicates(subset='name')
-publisher_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | country | city | state | starting_year | website | oa_policies |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue Médicale Suisse | 999999 | | | 0 | | |
-| 1 | 2 | American Physical Society | 236 | | | 0 | http://www.aps.org/ | |
-| 2 | 3 | Public Library of Science | 236 | | | 0 | http://www.plos.org/ | |
-| 3 | 4 | The Global Studies Institute de l’Université d... | 999999 | | | 0 | | |
-| 4 | 5 | Universitat de València, Departamento de Teorí... | 999999 | | | 0 | | |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 371 | 372 | [American Medical Association] | 999999 | | | 0 | http://archneur.jamanetwork.com/issues.aspx | |
-| 374 | 375 | Société botanique de Genève | 999999 | | | 0 | | |
-| 377 | 378 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 999999 | | | 0 | | |
-| 379 | 380 | Generative Grammar Group of the Department of ... | 999999 | | | 0 | | |
-| 380 | 999999 | UNKNOWN | 999999 | | | 0 | | |
-
-196 rows × 8 columns
-
-
-
-
-
-```python
-del publisher_export_dedup['id']
-# turn the index (still non-contiguous after deduplication) into a column
-publisher_export_dedup = publisher_export_dedup.reset_index()
-# set the id to index + 1
-publisher_export_dedup['id'] = publisher_export_dedup['index'] + 1
-del publisher_export_dedup['index']
-publisher_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 0 | Revue Médicale Suisse | 999999 | | | 0 | | | 1 |
-| 1 | American Physical Society | 236 | | | 0 | http://www.aps.org/ | | 2 |
-| 2 | Public Library of Science | 236 | | | 0 | http://www.plos.org/ | | 3 |
-| 3 | The Global Studies Institute de l’Université d... | 999999 | | | 0 | | | 4 |
-| 4 | Universitat de València, Departamento de Teorí... | 999999 | | | 0 | | | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 191 | [American Medical Association] | 999999 | | | 0 | http://archneur.jamanetwork.com/issues.aspx | | 372 |
-| 192 | Société botanique de Genève | 999999 | | | 0 | | | 375 |
-| 193 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 999999 | | | 0 | | | 378 |
-| 194 | Generative Grammar Group of the Department of ... | 999999 | | | 0 | | | 380 |
-| 195 | UNKNOWN | 999999 | | | 0 | | | 381 |
-
-196 rows × 8 columns
-
-
-
-
-
-```python
-del publisher_export_dedup['id']
-# second pass: the index is now contiguous (0..195), so turn it into a column again
-publisher_export_dedup = publisher_export_dedup.reset_index()
-# set the id to index + 1, which yields consecutive ids from 1 to 196
-publisher_export_dedup['id'] = publisher_export_dedup['index'] + 1
-del publisher_export_dedup['index']
-publisher_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 0 | Revue Médicale Suisse | 999999 | | | 0 | | | 1 |
-| 1 | American Physical Society | 236 | | | 0 | http://www.aps.org/ | | 2 |
-| 2 | Public Library of Science | 236 | | | 0 | http://www.plos.org/ | | 3 |
-| 3 | The Global Studies Institute de l’Université d... | 999999 | | | 0 | | | 4 |
-| 4 | Universitat de València, Departamento de Teorí... | 999999 | | | 0 | | | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 191 | [American Medical Association] | 999999 | | | 0 | http://archneur.jamanetwork.com/issues.aspx | | 192 |
-| 192 | Société botanique de Genève | 999999 | | | 0 | | | 193 |
-| 193 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 999999 | | | 0 | | | 194 |
-| 194 | Generative Grammar Group of the Department of ... | 999999 | | | 0 | | | 195 |
-| 195 | UNKNOWN | 999999 | | | 0 | | | 196 |
-
-196 rows × 8 columns
-
-
-
-
-
-```python
-# merge with the pre-Sherpa ids to attach the new publisher ids by name
-publisher_ids_dedup = pd.merge(publisher_ids_dedup, publisher_export_dedup[['id', 'name']], on='name', how='left')
-publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'id': 'publisher'})
-publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'journal': 'id'})
-publisher_ids_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | id | publisher_av_dedup | name | publisher |
-|---|---|---|---|---|
-| 0 | 1 | 1 | Revue Médicale Suisse | 1 |
-| 1 | 2 | 2 | American Physical Society | 2 |
-| 2 | 3 | 3 | Public Library of Science | 3 |
-| 3 | 4 | 4 | The Global Studies Institute de l’Université d... | 4 |
-| 4 | 4 | 5 | Universitat de València, Departamento de Teorí... | 5 |
-| ... | ... | ... | ... | ... |
-| 375 | 987 | 376 | Springer | 45 |
-| 376 | 987 | 377 | Springer | 45 |
-| 377 | 989 | 378 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 194 |
-| 378 | 991 | 379 | Elsevier | 11 |
-| 379 | 994 | 380 | Generative Grammar Group of the Department of ... | 195 |
-
-380 rows × 4 columns
-
-
-
-
-
-```python
-# concatenate the publisher values that share the same journal id
-del publisher_ids_dedup['publisher_av_dedup']
-del publisher_ids_dedup['name']
-publisher_ids_dedup['publisher'] = publisher_ids_dedup['publisher'].astype(str)
-publisher_ids_dedup_grouped = publisher_ids_dedup.groupby('id').agg({'publisher': lambda x: ', '.join(x)})
-publisher_ids_dedup_grouped
-```
-
-
-
-
-
-
-
-
-
-
-| id | publisher |
-|---|---|
-| 1 | 1 |
-| 2 | 2 |
-| 3 | 3 |
-| 4 | 4, 5 |
-| 5 | 2 |
-| ... | ... |
-| 986 | 193 |
-| 987 | 45, 45 |
-| 989 | 194 |
-| 991 | 11 |
-| 994 | 195 |
-
-366 rows × 1 columns
-
-
-
-
-
-```python
-# update the journals
-journal = pd.read_csv('sample/journal_fin_sherpa.tsv', encoding='utf-8', header=0, sep='\t')
-journal
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | publisher | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | oa_status |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 | NaN | 215 | 138 | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 1 |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 2 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 2 | 3 | PLoS ONE | NaN | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 5 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 | NaN | 209 | 124, 138, 402, 292 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN | 1 |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 6 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 47 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 75 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 8 | 0 | 0 | 1 | 1 | 1 | NaN | 2 |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 119 | 0 | 0 | 0 | 1 | 0 | NaN | 2 |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 217 | 0 | 0 | 0 | 0 | 0 | NaN | 1 |
-
-911 rows × 16 columns
-
-
-
-
-
-```python
-# merge with the journals from journal_fin_sherpa
-journal = pd.merge(journal, publisher_ids_dedup_grouped, on='id', how='left')
-journal
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | publisher_x | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | oa_status | publisher_y |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 | NaN | 215 | 138 | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 1 | 1 |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 2 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | 2 |
-| 2 | 3 | PLoS ONE | NaN | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 3 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 5 | 3 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 | NaN | 209 | 124, 138, 402, 292 | 4, 5 | 0 | 0 | 0 | 0 | 0 | NaN | 1 | 4, 5 |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 6 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | 2 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 47 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | NaN |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 75 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | NaN |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 8 | 0 | 0 | 1 | 1 | 1 | NaN | 2 | NaN |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 119 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | NaN |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 217 | 0 | 0 | 0 | 0 | 0 | NaN | 1 | NaN |
-
-911 rows × 17 columns
-
-
-
-
-
-```python
-del journal['publisher_x']
-journal = journal.rename(columns = {'publisher_y': 'publisher'})
-journal
-```
-
-
-
-
-
-
-
-
-
-
-| | id | name | name_short_iso_4 | starting_year | end_year | website | country | language | doaj_seal | doaj_status | lockss | portico | nlch | qoam_av_score | oa_status | publisher |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | Revue médicale suisse | Rev. méd. suisse | 2005 | 9999 | NaN | 215 | 138 | 0 | 0 | 0 | 0 | 0 | NaN | 1 | 1 |
-| 1 | 2 | Physical Review Letters | Phys. rev. lett. (Print) | 1958 | 9999 | http://prl.aps.org/ | 236 | 124 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | 2 |
-| 2 | 3 | PLoS ONE | NaN | 2006 | 9999 | http://www.plosone.org/ | 236 | 124 | 1 | 1 | 1 | 0 | 0 | 4.035714 | 5 | 3 |
-| 3 | 4 | EU-topías | EU-topías | 2011 | 9999 | NaN | 209 | 124, 138, 402, 292 | 0 | 0 | 0 | 0 | 0 | NaN | 1 | 4, 5 |
-| 4 | 5 | Physical review B: Condensed matter and materi... | Phys. rev., B, Condens. matter mater. phys. | 1998 | 2015 | http://journals.aps.org/prb/ | 236 | 124 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | 2 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 906 | 997 | Smart Materials and Structures | Smart mater. struct. (Print) | 1992 | 9999 | http://iopscience.iop.org/0964-1726 | 234 | 124 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | NaN |
-| 907 | 998 | Journal of Pediatric Surgery | J. pediatr. surg. (Print) | 1966 | 9999 | http://www.jpedsurg.org/ | 236 | 124 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | NaN |
-| 908 | 999 | Probability Theory and Related Fields | Probab. theory relat. fields (Internet) | uuuu | 9999 | http://www.springerlink.com/content/100451/?p=... | 83 | 124 | 0 | 0 | 1 | 1 | 1 | NaN | 2 | NaN |
-| 909 | 1000 | Renewable Energy | Renew. energy | 1991 | 9999 | http://www.elsevier.com/wps/product/cws_home/9... | 234 | 124 | 0 | 0 | 0 | 1 | 0 | NaN | 2 | NaN |
-| 910 | 1001 | Journal of applied physiology: respiratory, en... | J. appl. physiol.: respir., environ. exercise ... | 1977 | 1984 | https://www.physiology.org/journal/jappl | 236 | 124 | 0 | 0 | 0 | 0 | 0 | NaN | 1 | NaN |
-
-911 rows × 16 columns
-
-
-
-
-
-```python
-# export JSON (journal)
-result = journal.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/journal.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export csv
-journal.to_csv('sample/journal.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-journal.to_excel('sample/journal.xlsx', index=False)
-```
-
-
-```python
-# export JSON (publisher)
-result = publisher_export_dedup.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/publisher.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export csv
-publisher_export_dedup.to_csv('sample/publisher.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-publisher_export_dedup.to_excel('sample/publisher.xlsx', index=False)
-```
-
-
-```python
-
-```
diff --git a/import_scripts/07_oacct_sherpa_publishers.py b/import_scripts/07_oacct_sherpa_publishers.py
deleted file mode 100644
index 29af8f7e..00000000
--- a/import_scripts/07_oacct_sherpa_publishers.py
+++ /dev/null
@@ -1,348 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts the selected data from the sources obtained via API and processes it so that it can be used in the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# ## Journals Publishers table: adding the Sherpa information
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-
-
-# In[2]:
-
-
-publishers_issn = pd.read_csv('sample/publishers_brut.tsv', encoding='utf-8', header=0, sep='\t')
-publishers_issn
-
-
-# In[3]:
-
-
-# import ids
-publisher_ids = pd.read_csv('sample/journals_publishers_ids.tsv', encoding='utf-8', header=0, sep='\t')
-publisher_ids
-
-
-# In[4]:
-
-
-# rename the id columns: id becomes journal, publisher becomes id
-publisher_ids = publisher_ids.rename(columns = {'id': 'journal'})
-publisher_ids = publisher_ids.rename(columns = {'publisher': 'id'})
-
-
-# In[5]:
-
-
-# deduplicate by publisher id
-publisher_ids_dedup = publisher_ids.drop_duplicates(subset='id')
-publisher_ids_dedup
-
-
-# In[6]:
-
-
-# merge with the journals
-publisher = pd.merge(publishers_issn, publisher_ids_dedup, on='id', how='left')
-publisher
-
-
-# In[7]:
-
-
-# add the Sherpa values
-publisher_sherpa = pd.read_csv('sample/publisher_sherpa.tsv', encoding='utf-8', header=0, sep='\t')
-publisher_sherpa
-
-
-# In[8]:
-
-
-# rename the Sherpa columns
-publisher_sherpa = publisher_sherpa.rename(columns = {'publisher_id': 'publisher_id_sherpa', 'url': 'website_sherpa', 'country': 'iso_code'})
-
-
-# In[9]:
-
-
-# merge on the journal ids
-publisher = pd.merge(publisher, publisher_sherpa, on='journal', how='left')
-publisher
-
-
-# In[10]:
-
-
-# rename the name columns
-publisher = publisher.rename(columns = {'name_x': 'name_issn', 'name_y': 'name_sherpa'})
-
-
-# In[11]:
-
-
-# add information taken from the journals
-publisher_journals = pd.read_csv('sample/journals_publishers_brut.tsv', encoding='utf-8', header=0, sep='\t', usecols=['id', 'url'])
-publisher_journals
-
-
-# In[12]:
-
-
-# rename id to journal
-publisher_journals = publisher_journals.rename(columns = {'id': 'journal'})
-
-
-# In[13]:
-
-
-# merge on the journal ids
-publisher = pd.merge(publisher, publisher_journals, on='journal', how='left')
-publisher
-
-
-# In[14]:
-
-
-# drop the unused columns and rename url
-del publisher['publisher_id']
-del publisher['publisher_id_sherpa']
-del publisher['type']
-publisher = publisher.rename(columns = {'url' : 'website_issn_journal'})
-publisher
-
-
-# In[15]:
-
-
-# add the empty string fields and the integer field
-publisher['city'] = ''
-publisher['state'] = ''
-publisher['oa_policies'] = ''
-publisher['starting_year'] = 0
-publisher
-
-
-# In[16]:
-
-
-# upper-case iso_code
-publisher['iso_code'] = publisher['iso_code'].str.upper()
-# set the placeholder value for unknown countries
-publisher['iso_code'] = publisher['iso_code'].fillna('__')
-publisher
-
-
-# In[17]:
-
-
-# merge with the countries
-country = pd.read_csv('sample/country.tsv', usecols=('iso_code', 'id'), encoding='utf-8', header=0, sep='\t')
-country
-
-
-# In[18]:
-
-
-country = country.rename(columns={'id': 'country'})
-country
-
-
-# In[19]:
-
-
-publisher = pd.merge(publisher, country, on='iso_code', how='left')
-publisher
-
-
-# In[20]:
-
-
-# prefer the Sherpa values, fall back to issn.org
-publisher.loc[publisher['name_sherpa'].notna(), 'name'] = publisher['name_sherpa']
-publisher.loc[publisher['name_sherpa'].isna(), 'name'] = publisher['name_issn']
-publisher.loc[publisher['website_sherpa'].notna(), 'website'] = publisher['website_sherpa']
-publisher.loc[publisher['website_sherpa'].isna(), 'website'] = publisher['website_issn_journal']
-publisher
-
-
-# In[21]:
-
-
-# keep only the fields needed for the publisher table
-publisher_export = publisher[['id', 'name', 'country', 'city', 'state', 'starting_year', 'website', 'oa_policies']]
-
-
-# In[22]:
-
-
-# drop duplicate publisher ids
-publisher_export = publisher_export.drop_duplicates(subset='id')
-publisher_export
-
-
-# In[23]:
-
-
-# replace missing websites with empty strings
-publisher_export['website'] = publisher_export['website'].fillna('')
-publisher_export
-
-
-# In[24]:
-
-
-# merge to get the publisher names
-publisher_ids_dedup = pd.merge(publisher_ids_dedup, publisher_export[['id', 'name']], on='id', how='left')
-publisher_ids_dedup
-
-
-# In[25]:
-
-
-# keep the pre-deduplication ids so that publisher_ids_dedup can be corrected later
-publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'id': 'publisher_av_dedup'})
-publisher_ids_dedup
-
-
-# In[26]:
-
-
-publisher_export_dedup = publisher_export.drop_duplicates(subset='name')
-publisher_export_dedup
-
-
-# In[27]:
-
-
-del publisher_export_dedup['id']
-# turn the index (still non-contiguous after deduplication) into a column
-publisher_export_dedup = publisher_export_dedup.reset_index()
-# set the id to index + 1
-publisher_export_dedup['id'] = publisher_export_dedup['index'] + 1
-del publisher_export_dedup['index']
-publisher_export_dedup
-
-
-# In[28]:
-
-
-del publisher_export_dedup['id']
-# second pass: the index is now contiguous, so turn it into a column again
-publisher_export_dedup = publisher_export_dedup.reset_index()
-# set the id to index + 1, which yields consecutive ids starting at 1
-publisher_export_dedup['id'] = publisher_export_dedup['index'] + 1
-del publisher_export_dedup['index']
-publisher_export_dedup
-
-
-# In[29]:
-
-
-# merge with the pre-Sherpa ids to attach the new publisher ids by name
-publisher_ids_dedup = pd.merge(publisher_ids_dedup, publisher_export_dedup[['id', 'name']], on='name', how='left')
-publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'id': 'publisher'})
-publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'journal': 'id'})
-publisher_ids_dedup
-
-
-# In[30]:
-
-
-# concatenate the publisher values that share the same journal id
-del publisher_ids_dedup['publisher_av_dedup']
-del publisher_ids_dedup['name']
-publisher_ids_dedup['publisher'] = publisher_ids_dedup['publisher'].astype(str)
-publisher_ids_dedup_grouped = publisher_ids_dedup.groupby('id').agg({'publisher': lambda x: ', '.join(x)})
-publisher_ids_dedup_grouped
-
-
-# In[31]:
-
-
-# update the journals
-journal = pd.read_csv('sample/journal_fin_sherpa.tsv', encoding='utf-8', header=0, sep='\t')
-journal
-
-
-# In[32]:
-
-
-# merge with the journals from journal_fin_sherpa
-journal = pd.merge(journal, publisher_ids_dedup_grouped, on='id', how='left')
-journal
-
-
-# In[33]:
-
-
-del journal['publisher_x']
-journal = journal.rename(columns = {'publisher_y': 'publisher'})
-journal
-
-
-# In[34]:
-
-
-# export JSON (journal)
-result = journal.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/journal.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[35]:
-
-
-# export csv
-journal.to_csv('sample/journal.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[36]:
-
-
-# export excel
-journal.to_excel('sample/journal.xlsx', index=False)
-
-
-# In[37]:
-
-
-# export JSON (publisher)
-result = publisher_export_dedup.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/publisher.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[38]:
-
-
-# export csv
-publisher_export_dedup.to_csv('sample/publisher.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[39]:
-
-
-# export excel
-publisher_export_dedup.to_excel('sample/publisher.xlsx', index=False)
-
-
-# In[ ]:
-
-
-
-
diff --git a/import_scripts/08_oacct_sherpa_issns.md b/import_scripts/08_oacct_sherpa_issns.md
deleted file mode 100644
index 8989dea2..00000000
--- a/import_scripts/08_oacct_sherpa_issns.md
+++ /dev/null
@@ -1,2204 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) project
-
-P5 project of the EPFL library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook extracts the selected data from the sources obtained via API and processes it so that it can be used in the OACCT application.
-
-Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-Last updated: 16.07.2021
-
-## ISSNs table
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-```
-
-
-```python
-issns = pd.read_csv('sample/issn_brut.tsv', encoding='utf-8', sep='\t')
-issns
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id |
-|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 |
-
-1760 rows × 6 columns
-
-
-
-
-## Adding the format from Sherpa
-
-
-```python
-# add the format from Sherpa
-issn_sherpa = pd.read_csv('sample/issn_sherpa.tsv', encoding='utf-8', sep='\t')
-issn_sherpa
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 | print |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 | electronic |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 | print |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 | electronic |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 | print |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 | print |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 | electronic |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | electronic |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-1760 rows × 7 columns
-
-
-
-
-
-```python
-issn_sherpa['type'] = issn_sherpa['type'].str.upper()
-issn_sherpa
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 | PRINT |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 | ELECTRONIC |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 | PRINT |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 | ELECTRONIC |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 | PRINT |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 | PRINT |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 | ELECTRONIC |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | ELECTRONIC |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-1760 rows × 7 columns
-
-
-
-
-
-```python
-issns = pd.merge(issns, issn_sherpa[['issn', 'type']], on='issn', how='outer')
-issns
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 | PRINT |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 | ELECTRONIC |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 | PRINT |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 | ELECTRONIC |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 | PRINT |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 | PRINT |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 | ELECTRONIC |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | ELECTRONIC |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-1760 rows × 7 columns
-
-
-
-
-
-```python
-issns['format'].value_counts()
-```
-
-
-
-
- PRINT 816
- ELECTRONIC 90
- OTHER 2
- Name: format, dtype: int64
-
-
-
-
-```python
-issns['type'].value_counts()
-```
-
-
-
-
- PRINT 750
- ELECTRONIC 575
- Name: type, dtype: int64
-
-
-
-
-```python
-# check the rows that have neither a format (ISSN.org) nor a type (Sherpa)
-issns.loc[issns['format'].isna() & issns['type'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 5 | 1520-8524 | 0001-4966 | 789 | NaN | 3 | 6 | NaN |
-| 6 | 1520-9024 | 0001-4966 | 789 | NaN | 3 | 7 | NaN |
-| 17 | 1943-2984 | 0002-7863 | 8 | NaN | 3 | 18 | NaN |
-| 23 | 1555-7162 | 0002-9343 | 985 | NaN | 3 | 24 | NaN |
-| 27 | 2163-5773 | 0002-9513 | 787 | NaN | 3 | 28 | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1722 | 2160-9047 | 2160-9020 | 467 | NaN | 3 | 1723 | NaN |
-| 1729 | 2340-115X | 2174-8454 | 4 | NaN | 3 | 1730 | NaN |
-| 1732 | 2211-3282 | 2211-2855 | 990 | NaN | 3 | 1733 | NaN |
-| 1739 | 2297-7007 | 2297-6981 | 618 | NaN | 3 | 1740 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-326 rows × 7 columns
-
-
-
-
-
-```python
-# check the rows where format and type agree
-issns.loc[issns['format'] == issns['type']]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 | PRINT |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 | PRINT |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 | PRINT |
-| 7 | 0001-6268 | 0001-6268 | 166 | PRINT | 1 | 8 | PRINT |
-| 9 | 0001-6322 | 0001-6322 | 807 | PRINT | 1 | 10 | PRINT |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1748 | 2380-8195 | 2380-8195 | 947 | ELECTRONIC | 2 | 1749 | ELECTRONIC |
-| 1749 | 2469-990X | 2469-990X | 684 | ELECTRONIC | 2 | 1750 | ELECTRONIC |
-| 1751 | 2469-9950 | 2469-9950 | 41 | PRINT | 1 | 1752 | PRINT |
-| 1753 | 2470-0010 | 2470-0010 | 80 | PRINT | 1 | 1754 | PRINT |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | ELECTRONIC |
-
-774 rows × 7 columns
-
-
-
-
-
-```python
-# check the rows where format and type differ (NaN != NaN, so rows missing both are included)
-issns.loc[issns['format'] != issns['type']]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 3 | 2 | ELECTRONIC |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 3 | 4 | ELECTRONIC |
-| 5 | 1520-8524 | 0001-4966 | 789 | NaN | 3 | 6 | NaN |
-| 6 | 1520-9024 | 0001-4966 | 789 | NaN | 3 | 7 | NaN |
-| 8 | 0942-0940 | 0001-6268 | 166 | NaN | 3 | 9 | ELECTRONIC |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1754 | 2470-0029 | 2470-0010 | 80 | NaN | 3 | 1755 | ELECTRONIC |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 | PRINT |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 3 | 1757 | ELECTRONIC |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-986 rows × 7 columns
-
-
-
-
-
-```python
-# assign the type id, preferring ISSN.org (format) and falling back to Sherpa (type)
-# PRINT = 1
-# ELECTRONIC = 2
-# OTHER = 3
-issns['issn_type'] = issns['format']
-issns.loc[issns['format'].isna(), 'issn_type'] = issns['type']
-issns['issn_type'] = issns['issn_type'].str.replace('PRINT', '1')
-issns['issn_type'] = issns['issn_type'].str.replace('ELECTRONIC', '2')
-issns['issn_type'] = issns['issn_type'].str.replace('OTHER', '3')
-issns['issn_type'] = issns['issn_type'].fillna(3)
-issns
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 | PRINT |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 2 | 2 | ELECTRONIC |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 | PRINT |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 2 | 4 | ELECTRONIC |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 | PRINT |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 | PRINT |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 2 | 1757 | ELECTRONIC |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | ELECTRONIC |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-
-
-
1760 rows × 7 columns
-
-
-
-
-
-```python
-# check discrepancies: format is PRINT but the Sherpa type is ELECTRONIC
-issns.loc[(issns['format'] == 'PRINT') & (issns['type'] == 'ELECTRONIC')]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 1123 | 0959-8138 | 0959-8138 | 383 | PRINT | 1 | 1124 | ELECTRONIC |
-| 1191 | 1025-496X | 1025-496X | 779 | PRINT | 1 | 1192 | ELECTRONIC |
-| 1451 | 1465-6906 | 1465-6906 | 773 | PRINT | 1 | 1452 | ELECTRONIC |
-
-
-
-
-
-
-
-
-```python
-# check discrepancies: format is ELECTRONIC but the Sherpa type is PRINT
-issns.loc[(issns['format'] == 'ELECTRONIC') & (issns['type'] == 'PRINT')]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 121 | 0009-7330 | 0009-7330 | 948 | ELECTRONIC | 2 | 122 | PRINT |
-| 360 | 0024-3795 | 0024-3795 | 968 | ELECTRONIC | 2 | 361 | PRINT |
-| 595 | 0163-3864 | 0163-3864 | 701 | ELECTRONIC | 2 | 596 | PRINT |
-| 653 | 0194-911X | 0194-911X | 871 | ELECTRONIC | 2 | 654 | PRINT |
-| 665 | 0197-9337 | 0197-9337 | 672 | ELECTRONIC | 2 | 666 | PRINT |
-| 711 | 0270-6474 | 0270-6474 | 73 | ELECTRONIC | 2 | 712 | PRINT |
-| 734 | 0278-2391 | 0278-2391 | 521 | ELECTRONIC | 2 | 735 | PRINT |
-| 928 | 0743-7463 | 0743-7463 | 114 | ELECTRONIC | 2 | 929 | PRINT |
-| 1205 | 1040-4651 | 1040-4651 | 886 | ELECTRONIC | 2 | 1206 | PRINT |
-| 1243 | 1059-7794 | 1059-7794 | 440 | ELECTRONIC | 2 | 1244 | PRINT |
-| 1287 | 1079-5642 | 1079-5642 | 468 | ELECTRONIC | 2 | 1288 | PRINT |
-| 1503 | 1528-3542 | 1528-3542 | 547 | ELECTRONIC | 2 | 1504 | PRINT |
-| 1513 | 1530-6984 | 1530-6984 | 36 | ELECTRONIC | 2 | 1514 | PRINT |
-| 1515 | 1534-4320 | 1534-4320 | 735 | ELECTRONIC | 2 | 1516 | PRINT |
-| 1538 | 1549-9618 | 1549-9618 | 158 | ELECTRONIC | 2 | 1539 | PRINT |
-| 1546 | 1553-734X | 1553-734X | 240 | ELECTRONIC | 2 | 1547 | PRINT |
-| 1661 | 1876-6102 | 1876-6102 | 249 | ELECTRONIC | 2 | 1662 | PRINT |
-| 1662 | 1877-0568 | 1877-0568 | 675 | ELECTRONIC | 2 | 1663 | PRINT |
-| 1663 | 1877-7058 | 1877-7058 | 632 | ELECTRONIC | 2 | 1664 | PRINT |
-| 1730 | 2211-1247 | 2211-1247 | 113 | ELECTRONIC | 2 | 1731 | PRINT |
-
-
-
-
-
-
-
-
-```python
-# check the rows with no format where the Sherpa type is PRINT
-issns.loc[issns['format'].isna() & (issns['type'] == 'PRINT')]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 31 | 0003-2670 | 0003-2670 | 415 | NaN | 1 | 32 | PRINT |
-| 127 | 0010-3616 | 0010-3616 | 417 | NaN | 1 | 128 | PRINT |
-| 151 | 0012-9402 | 0012-9402 | 237 | NaN | 1 | 152 | PRINT |
-| 216 | 0018-9375 | 0018-9375 | 361 | NaN | 1 | 217 | PRINT |
-| 376 | 0026-4598 | 0026-4598 | 496 | NaN | 1 | 377 | PRINT |
-| 643 | 0178-8051 | 0178-8051 | 999 | NaN | 1 | 644 | PRINT |
-| 838 | 1388-6150 | 0368-4466 | 499 | NaN | 1 | 839 | PRINT |
-| 1192 | 1560-7917 | 1025-496X | 779 | NaN | 1 | 1193 | PRINT |
-| 1201 | 1126-6708 | 1029-8479 | 7 | NaN | 1 | 1202 | PRINT |
-| 1249 | 1063-651X | 1063-651X | 588 | NaN | 1 | 1250 | PRINT |
-| 1531 | 1538-7933 | 1538-7836 | 148 | NaN | 1 | 1532 | PRINT |
-| 1560 | 1569-9293 | 1569-9285 | 822 | NaN | 1 | 1561 | PRINT |
-| 1597 | 1662-4548 | 1662-453X | 421 | NaN | 1 | 1598 | PRINT |
-| 1658 | 8756-3282 | 1873-2763 | 488 | NaN | 1 | 1659 | PRINT |
-
-
-
-
-
-
-
-
-```python
-# convert journal to int
-issns['journal'] = issns['journal'].astype(int)
-```
-
-
-```python
-# turn the index into a 1-based id
-issns = issns.reset_index()
-issns['id'] = issns['index'] + 1
-del issns['index']
-issns
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | journal | format | issn_type | id | type |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-2815 | 0001-2815 | 532 | PRINT | 1 | 1 | PRINT |
-| 1 | 1399-0039 | 0001-2815 | 532 | NaN | 2 | 2 | ELECTRONIC |
-| 2 | 0001-4842 | 0001-4842 | 498 | PRINT | 1 | 3 | PRINT |
-| 3 | 1520-4898 | 0001-4842 | 498 | NaN | 2 | 4 | ELECTRONIC |
-| 4 | 0001-4966 | 0001-4966 | 789 | PRINT | 1 | 5 | PRINT |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1755 | 2470-0045 | 2470-0045 | 533 | OTHER | 3 | 1756 | PRINT |
-| 1756 | 2470-0053 | 2470-0045 | 533 | NaN | 2 | 1757 | ELECTRONIC |
-| 1757 | 2475-9953 | 2475-9953 | 608 | ELECTRONIC | 2 | 1758 | ELECTRONIC |
-| 1758 | 2504-4427 | 2504-4427 | 994 | PRINT | 1 | 1759 | NaN |
-| 1759 | 2504-4435 | 2504-4427 | 994 | NaN | 3 | 1760 | NaN |
-
-
-
-
1760 rows × 7 columns
-
-
-
-
-
-```python
-issns['issn_type'] = issns['issn_type'].astype(int)
-```
-
-
-```python
-issns_export = issns[['id', 'issn', 'journal', 'issn_type']]
-issns_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | journal | issn_type |
-|---|---|---|---|---|
-| 0 | 1 | 0001-2815 | 532 | 1 |
-| 1 | 2 | 1399-0039 | 532 | 2 |
-| 2 | 3 | 0001-4842 | 498 | 1 |
-| 3 | 4 | 1520-4898 | 498 | 2 |
-| 4 | 5 | 0001-4966 | 789 | 1 |
-| ... | ... | ... | ... | ... |
-| 1755 | 1756 | 2470-0045 | 533 | 3 |
-| 1756 | 1757 | 2470-0053 | 533 | 2 |
-| 1757 | 1758 | 2475-9953 | 608 | 2 |
-| 1758 | 1759 | 2504-4427 | 994 | 1 |
-| 1759 | 1760 | 2504-4435 | 994 | 3 |
-
-
-
-
1760 rows × 4 columns
-
-
-
-
-
-```python
-# drop duplicate rows by ISSN
-issns_export = issns_export.drop_duplicates(subset='issn')
-issns_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | journal | issn_type |
-|---|---|---|---|---|
-| 0 | 1 | 0001-2815 | 532 | 1 |
-| 1 | 2 | 1399-0039 | 532 | 2 |
-| 2 | 3 | 0001-4842 | 498 | 1 |
-| 3 | 4 | 1520-4898 | 498 | 2 |
-| 4 | 5 | 0001-4966 | 789 | 1 |
-| ... | ... | ... | ... | ... |
-| 1755 | 1756 | 2470-0045 | 533 | 3 |
-| 1756 | 1757 | 2470-0053 | 533 | 2 |
-| 1757 | 1758 | 2475-9953 | 608 | 2 |
-| 1758 | 1759 | 2504-4427 | 994 | 1 |
-| 1759 | 1760 | 2504-4435 | 994 | 3 |
-
-
-
-
1760 rows × 4 columns
-
-
-
-
-
-```python
-# export JSON
-result = issns_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/issn.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export TSV (tab-separated)
-issns_export.to_csv('sample/issn.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-issns_export.to_excel('sample/issn.xlsx', index=False)
-```
diff --git a/import_scripts/08_oacct_sherpa_issns.py b/import_scripts/08_oacct_sherpa_issns.py
deleted file mode 100644
index b48bac00..00000000
--- a/import_scripts/08_oacct_sherpa_issns.py
+++ /dev/null
@@ -1,185 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook extracts selected data from the sources obtained via API and processes them so that they can be used in the OACCT application.
-#
-# Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-# Last updated: 16.07.2021
-
-# ## ISSNs table
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-
-
-# In[2]:
-
-
-issns = pd.read_csv('sample/issn_brut.tsv', encoding='utf-8', sep='\t')
-issns
-
-
-# ## Adding the format from Sherpa
-
-# In[3]:
-
-
-# add the format from Sherpa
-issn_sherpa = pd.read_csv('sample/issn_sherpa.tsv', encoding='utf-8', sep='\t')
-issn_sherpa
-
-
-# In[4]:
-
-
-issn_sherpa['type'] = issn_sherpa['type'].str.upper()
-issn_sherpa
-
-
-# In[5]:
-
-
-issns = pd.merge(issns, issn_sherpa[['issn', 'type']], on='issn', how='outer')
-issns
-
-
-# In[6]:
-
-
-issns['format'].value_counts()
-
-
-# In[7]:
-
-
-issns['type'].value_counts()
-
-
-# In[8]:
-
-
-# check the rows with neither a format nor a type
-issns.loc[issns['format'].isnull() & issns['type'].isnull()]
-
-
-# In[9]:
-
-
-# check the rows where format and type are equal
-issns.loc[issns['format'] == issns['type']]
-
-
-# In[10]:
-
-
-# check the rows where format and type differ (rows with NaN also count as different)
-issns.loc[issns['format'] != issns['type']]
-
-
-# In[11]:
-
-
-# assign the type id, preferring ISSN.org (format) and falling back to Sherpa (type)
-# PRINT = 1
-# ELECTRONIC = 2
-# OTHER = 3
-issns['issn_type'] = issns['format']
-issns.loc[issns['format'].isna(), 'issn_type'] = issns['type']
-issns['issn_type'] = issns['issn_type'].str.replace('PRINT', '1')
-issns['issn_type'] = issns['issn_type'].str.replace('ELECTRONIC', '2')
-issns['issn_type'] = issns['issn_type'].str.replace('OTHER', '3')
-issns['issn_type'] = issns['issn_type'].fillna(3)
-issns
-
-
-# In[12]:
-
-
-# check discrepancies: format is PRINT but the Sherpa type is ELECTRONIC
-issns.loc[(issns['format'] == 'PRINT') & (issns['type'] == 'ELECTRONIC')]
-
-
-# In[13]:
-
-
-# check discrepancies: format is ELECTRONIC but the Sherpa type is PRINT
-issns.loc[(issns['format'] == 'ELECTRONIC') & (issns['type'] == 'PRINT')]
-
-
-# In[14]:
-
-
-# check the rows with no format where the Sherpa type is PRINT
-issns.loc[issns['format'].isna() & (issns['type'] == 'PRINT')]
-
-
-# In[15]:
-
-
-# convert journal to int
-issns['journal'] = issns['journal'].astype(int)
-
-
-# In[16]:
-
-
-# turn the index into a 1-based id
-issns = issns.reset_index()
-issns['id'] = issns['index'] + 1
-del issns['index']
-issns
-
-
-# In[17]:
-
-
-issns['issn_type'] = issns['issn_type'].astype(int)
-
-
-# In[18]:
-
-
-issns_export = issns[['id', 'issn', 'journal', 'issn_type']]
-issns_export
-
-
-# In[19]:
-
-
-# drop duplicate rows by ISSN
-issns_export = issns_export.drop_duplicates(subset='issn')
-issns_export
-
-
-# In[20]:
-
-
-# export JSON
-result = issns_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/issn.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[21]:
-
-
-# export TSV (tab-separated)
-issns_export.to_csv('sample/issn.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[22]:
-
-
-# export excel
-issns_export.to_excel('sample/issn.xlsx', index=False)
-
diff --git a/import_scripts/09_oacct_read_and_publish.md b/import_scripts/09_oacct_read_and_publish.md
deleted file mode 100644
index df115477..00000000
--- a/import_scripts/09_oacct_read_and_publish.md
+++ /dev/null
@@ -1,9540 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) project
-
-P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook transforms the data extracted from the various sources and exports them to the tables of the OACCT application.
-
-Author: **Pablo Iriarte**, University of Geneva (pablo.iriarte@unige.ch)
-Last updated: 08.09.2021
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# display all columns
-pd.set_option('display.max_columns', None)
-# set the starting value for the ids
-id_start = 1
-```
-
-## Adding discounts for journals covered by Read & Publish licences
-
-Journals list by publisher :
- * https://consortium.ch/elsevier_titlelist_publication
- * https://consortium.ch/springer_titlelist_publication
- * https://consortium.ch/wiley_titlelist_publish
- * https://consortium.ch/tandf_titlelist_publish
- * https://consortium.ch/sage_titlelist_publish
- * https://consortium.ch/cup_titlelist_publish
-
-Licence term :
- * Elsevier : 2020-2023
- * Springer Nature : 2020-2022
- * Wiley : 2021-2024
- * Taylor & Francis : 2021-2023
- * Cambridge University Press (CUP) : 2021-2023
-
-CC licences :
- * Elsevier : CC-BY, CC-BY-NC-ND
- * Springer Nature : CC-BY, CC-BY-NC
- * Wiley : CC-BY, CC-BY-NC, CC-BY-NC-ND
- * Taylor & Francis : CC-BY
- * Cambridge University Press (CUP) : CC-BY, CC-BY-NC, CC-BY-NC-ND, CC-BY-NC-SA
-
-Special conditions :
- * Cambridge University Press (CUP) : Only the following article types are covered: Research Articles, Review Articles, Rapid Communication, Brief Reports and Case Reports
-
-
-
-## Importing the ISSN file
-
-
-```python
-issn = pd.read_csv('sample/issn.tsv', encoding='utf-8', header=0, sep='\t')
-issn
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | journal | issn_type |
-|---|---|---|---|---|
-| 0 | 1 | 0001-2815 | 532 | 1 |
-| 1 | 2 | 1399-0039 | 532 | 2 |
-| 2 | 3 | 0001-4842 | 498 | 1 |
-| 3 | 4 | 1520-4898 | 498 | 2 |
-| 4 | 5 | 0001-4966 | 789 | 1 |
-| ... | ... | ... | ... | ... |
-| 1755 | 1756 | 2470-0045 | 533 | 3 |
-| 1756 | 1757 | 2470-0053 | 533 | 2 |
-| 1757 | 1758 | 2475-9953 | 608 | 2 |
-| 1758 | 1759 | 2504-4427 | 994 | 1 |
-| 1759 | 1760 | 2504-4435 | 994 | 3 |
-
-
-
-
1760 rows × 4 columns
-
-
-
-
-
-```python
-# open publishers
-publisher = pd.read_csv('sample/publisher.tsv', encoding='utf-8', header=0, sep='\t')
-publisher
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 0 | Revue Médicale Suisse | 999999 | NaN | NaN | 0 | NaN | NaN | 1 |
-| 1 | American Physical Society | 236 | NaN | NaN | 0 | http://www.aps.org/ | NaN | 2 |
-| 2 | Public Library of Science | 236 | NaN | NaN | 0 | http://www.plos.org/ | NaN | 3 |
-| 3 | The Global Studies Institute de l’Université d... | 999999 | NaN | NaN | 0 | NaN | NaN | 4 |
-| 4 | Universitat de València, Departamento de Teorí... | 999999 | NaN | NaN | 0 | NaN | NaN | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 191 | [American Medical Association] | 999999 | NaN | NaN | 0 | http://archneur.jamanetwork.com/issues.aspx | NaN | 192 |
-| 192 | Société botanique de Genève | 999999 | NaN | NaN | 0 | NaN | NaN | 193 |
-| 193 | Red.: Prof. Dr. F. Cavalli, Istituto oncologic... | 999999 | NaN | NaN | 0 | NaN | NaN | 194 |
-| 194 | Generative Grammar Group of the Department of ... | 999999 | NaN | NaN | 0 | NaN | NaN | 195 |
-| 195 | UNKNOWN | 999999 | NaN | NaN | 0 | NaN | NaN | 196 |
-
-
-
-
196 rows × 8 columns
-
-
-
-
-
-```python
-publisher.loc[publisher['name'] == 'Elsevier']
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 10 | Elsevier | 236 | NaN | NaN | 0 | http://www.elsevier.com/ | NaN | 11 |
-
-
-
-
-
-
-
-
-```python
-publisher.loc[(publisher['name'] == 'Springer Verlag') | (publisher['name'] == 'Nature Research')]
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 8 | Nature Research | 234 | NaN | NaN | 0 | http://www.nature.com/ | NaN | 9 |
-| 28 | Springer Verlag | 83 | NaN | NaN | 0 | http://www.springerlink.com/?MUD=MP | NaN | 29 |
-
-
-
-
-
-
-
-
-```python
-publisher.loc[publisher['name'] == 'Wiley']
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 11 | Wiley | 236 | NaN | NaN | 0 | https://www.wiley.com/en-gb | NaN | 12 |
-
-
-
-
-
-
-
-
-```python
-publisher.loc[publisher['name'] == 'Taylor and Francis']
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 23 | Taylor and Francis | 234 | NaN | NaN | 0 | http://www.tandf.co.uk/journals/default.asp | NaN | 24 |
-
-
-
-
-
-
-
-
-```python
-publisher.loc[publisher['name'] == 'Cambridge University Press']
-```
-
-
-
-
-
-
-
-
-
-
-| | name | country | city | state | starting_year | website | oa_policies | id |
-|---|---|---|---|---|---|---|---|---|
-| 60 | Cambridge University Press | 234 | NaN | NaN | 0 | http://www.cambridge.org/uk/ | NaN | 61 |
-
-
-
-
-
-
-
-
-```python
-# open the list of organisations
-participants = pd.read_csv('agreements/consortium_institutions_participation_read_and_publish.csv', encoding='utf-8', header=0, sep='\t')
-participants
-```
-
-
-
-
-
-
-
-
-
-
-| | Institution | Elsevier | Springer Nature | Wiley | ROR |
-|---|---|---|---|---|---|
-| 0 | Agroscope | x | x | x | https://ror.org/04d8ztx87 |
-| 1 | Berner Fachhochschule BFH | x | x | x | https://ror.org/02bnkt322 |
-| 2 | CERN | NaN | x | x | https://ror.org/01ggx4157 |
-| 3 | Eidgenössisches Hochschulinstitut für Berufsbi... | x | x | x | https://ror.org/00zg4za48 |
-| 4 | EPF Lausanne | x | x | x | https://ror.org/02s376052 |
-| 5 | ETH Zürich | x | x | x | https://ror.org/05a28rw58 |
-| 6 | Fachhochschule Graubünden FHGR | x | x | x | https://ror.org/032ymzc07 |
-| 7 | Fachhochschule Nordwestschweiz FHNW | x | x | x | https://ror.org/04mq2g308 |
-| 8 | Forschungsinstitut für biologischen Landbau FibL | x | x | x | https://ror.org/0210tb741 |
-| 9 | Graduate Institute (IHEID) – since 2021 | x | x | x | https://ror.org/007ygn379 |
-| 10 | Haute école spécialisée de Suisse occidentale ... | x | x | x | https://ror.org/01xkakk17 |
-| 11 | HEP Berne, Jura, Neuchâtel (HEP-BEJUNE) | x | x | x | https://ror.org/015pmkr43 |
-| 12 | HEP Fribourg (PHFR) | x | x | x | https://ror.org/048gre751 |
-| 13 | HEP Vaud | x | x | x | https://ror.org/01bvm0h13 |
-| 14 | Hochschule für Wirtschaft Zürich HWZ | x | x | x | https://ror.org/02ejkey04 |
-| 15 | Hochschule Luzern HSLU | x | x | x | https://ror.org/04nd0xd48 |
-| 16 | Interkantonale Hochschule für Heilpädagogik (HfH) | x | x | x | https://ror.org/00w9q2c06 |
-| 17 | Kalaidos | x | x | x | https://ror.org/049c2kr37 |
-| 18 | Lib4RI | x | x | x | https://ror.org/021f7p178 |
-| 19 | Medi | NaN | x | NaN | NaN |
-| 20 | MMV - Medicine for Malaria Ventures | x | x | x | https://ror.org/00p9jf779 |
-| 21 | Ostschweizer Fachhochschulen OST | x | x | x | https://ror.org/038mj2660 |
-| 22 | Pädagogische Hochschule Zürich PHZH | x | x | x | https://ror.org/01awgk221 |
-| 23 | PH Bern | x | x | x | https://ror.org/05jf1ma54 |
-| 24 | PH Graubünden (PHGR) | x | x | x | https://ror.org/02fjgft97 |
-| 25 | PH Luzern | x | x | x | https://ror.org/0235ynq74 |
-| 26 | PH Schaffhausen (PHSH) | x | x | x | https://ror.org/03fs41j10 |
-| 27 | PH Schwyz | x | x | x | https://ror.org/00rqdn375 |
-| 28 | PH St. Gallen (PHSG) | x | x | x | https://ror.org/05m37v666 |
-| 29 | PH Thurgau (PHTG) | x | x | x | https://ror.org/04bf6dq94 |
-| 30 | PH Wallis / HEP Valais | x | x | x | https://ror.org/040gs8e06 |
-| 31 | PH Zug | x | x | x | https://ror.org/05ghhx264 |
-| 32 | Schweizerische Vogelwarte | x | x | x | https://ror.org/03mcsbr76 |
-| 33 | Scuola universitaria professionale della Svizz... | x | x | x | https://ror.org/05ep8g269 |
-| 34 | Università della Svizzera italiana USI | x | x | x | https://ror.org/03c4atk17 |
-| 35 | Universität Basel | x | x | x | https://ror.org/02s6k3f65 |
-| 36 | Universität Bern | x | x | x | https://ror.org/02k7v4d05 |
-| 37 | Universität Liechtenstein | x | x | x | https://ror.org/01qjrx392 |
-| 38 | Universität Luzern | x | x | x | https://ror.org/00kgrkn83 |
-| 39 | Universität St. Gallen | x | x | x | https://ror.org/0561a3s31 |
-| 40 | Universität Zürich | x | x | x | https://ror.org/02crff812 |
-| 41 | Université de Fribourg | x | x | x | https://ror.org/022fs9h90 |
-| 42 | Université de Genève | x | x | x | https://ror.org/01swzsf04 |
-| 43 | Université de Lausanne | x | x | x | https://ror.org/019whta54 |
-| 44 | Université de Neuchâtel | x | x | x | https://ror.org/00vasag41 |
-| 45 | Zürcher Hochschule der Künste ZHdK | x | x | x | https://ror.org/05r0ap620 |
-| 46 | Zürcher Hochschule für Angewandte Wissenschaft... | x | x | x | https://ror.org/05pmsvm27 |
-
-
-
-
-
-
-
-
-```python
-# remove Lib4RI, which is a library
-participants = participants.loc[participants['Institution'] != 'Lib4RI']
-participants
-```
-
-
-
-
-
-
-
-
-
-
-| | Institution | Elsevier | Springer Nature | Wiley | ROR |
-|---|---|---|---|---|---|
-| 0 | Agroscope | x | x | x | https://ror.org/04d8ztx87 |
-| 1 | Berner Fachhochschule BFH | x | x | x | https://ror.org/02bnkt322 |
-| 2 | CERN | NaN | x | x | https://ror.org/01ggx4157 |
-| 3 | Eidgenössisches Hochschulinstitut für Berufsbi... | x | x | x | https://ror.org/00zg4za48 |
-| 4 | EPF Lausanne | x | x | x | https://ror.org/02s376052 |
-| 5 | ETH Zürich | x | x | x | https://ror.org/05a28rw58 |
-| 6 | Fachhochschule Graubünden FHGR | x | x | x | https://ror.org/032ymzc07 |
-| 7 | Fachhochschule Nordwestschweiz FHNW | x | x | x | https://ror.org/04mq2g308 |
-| 8 | Forschungsinstitut für biologischen Landbau FibL | x | x | x | https://ror.org/0210tb741 |
-| 9 | Graduate Institute (IHEID) – since 2021 | x | x | x | https://ror.org/007ygn379 |
-| 10 | Haute école spécialisée de Suisse occidentale ... | x | x | x | https://ror.org/01xkakk17 |
-| 11 | HEP Berne, Jura, Neuchâtel (HEP-BEJUNE) | x | x | x | https://ror.org/015pmkr43 |
-| 12 | HEP Fribourg (PHFR) | x | x | x | https://ror.org/048gre751 |
-| 13 | HEP Vaud | x | x | x | https://ror.org/01bvm0h13 |
-| 14 | Hochschule für Wirtschaft Zürich HWZ | x | x | x | https://ror.org/02ejkey04 |
-| 15 | Hochschule Luzern HSLU | x | x | x | https://ror.org/04nd0xd48 |
-| 16 | Interkantonale Hochschule für Heilpädagogik (HfH) | x | x | x | https://ror.org/00w9q2c06 |
-| 17 | Kalaidos | x | x | x | https://ror.org/049c2kr37 |
-| 19 | Medi | NaN | x | NaN | NaN |
-| 20 | MMV - Medicine for Malaria Ventures | x | x | x | https://ror.org/00p9jf779 |
-| 21 | Ostschweizer Fachhochschulen OST | x | x | x | https://ror.org/038mj2660 |
-| 22 | Pädagogische Hochschule Zürich PHZH | x | x | x | https://ror.org/01awgk221 |
-| 23 | PH Bern | x | x | x | https://ror.org/05jf1ma54 |
-| 24 | PH Graubünden (PHGR) | x | x | x | https://ror.org/02fjgft97 |
-| 25 | PH Luzern | x | x | x | https://ror.org/0235ynq74 |
-| 26 | PH Schaffhausen (PHSH) | x | x | x | https://ror.org/03fs41j10 |
-| 27 | PH Schwyz | x | x | x | https://ror.org/00rqdn375 |
-| 28 | PH St. Gallen (PHSG) | x | x | x | https://ror.org/05m37v666 |
-| 29 | PH Thurgau (PHTG) | x | x | x | https://ror.org/04bf6dq94 |
-| 30 | PH Wallis / HEP Valais | x | x | x | https://ror.org/040gs8e06 |
-| 31 | PH Zug | x | x | x | https://ror.org/05ghhx264 |
-| 32 | Schweizerische Vogelwarte | x | x | x | https://ror.org/03mcsbr76 |
-| 33 | Scuola universitaria professionale della Svizz... | x | x | x | https://ror.org/05ep8g269 |
-| 34 | Università della Svizzera italiana USI | x | x | x | https://ror.org/03c4atk17 |
-| 35 | Universität Basel | x | x | x | https://ror.org/02s6k3f65 |
-| 36 | Universität Bern | x | x | x | https://ror.org/02k7v4d05 |
-| 37 | Universität Liechtenstein | x | x | x | https://ror.org/01qjrx392 |
-| 38 | Universität Luzern | x | x | x | https://ror.org/00kgrkn83 |
-| 39 | Universität St. Gallen | x | x | x | https://ror.org/0561a3s31 |
-| 40 | Universität Zürich | x | x | x | https://ror.org/02crff812 |
-| 41 | Université de Fribourg | x | x | x | https://ror.org/022fs9h90 |
-| 42 | Université de Genève | x | x | x | https://ror.org/01swzsf04 |
-| 43 | Université de Lausanne | x | x | x | https://ror.org/019whta54 |
-| 44 | Université de Neuchâtel | x | x | x | https://ror.org/00vasag41 |
-| 45 | Zürcher Hochschule der Künste ZHdK | x | x | x | https://ror.org/05r0ap620 |
-| 46 | Zürcher Hochschule für Angewandte Wissenschaft... | x | x | x | https://ror.org/05pmsvm27 |
-
-
-
-
-
-
-
-
-```python
-# add TF and CUP for all institutions (TODO: obtain the list of libraries for these two licences)
-participants['TF'] = 'x'
-participants['CUP'] = 'x'
-participants
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:3: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
- This is separate from the ipykernel package so we can avoid doing imports until
-
-
-
-
-
-
-
-
-
-
-
-| | Institution | Elsevier | Springer Nature | Wiley | ROR | TF | CUP |
-|---|---|---|---|---|---|---|---|
-| 0 | Agroscope | x | x | x | https://ror.org/04d8ztx87 | x | x |
-| 1 | Berner Fachhochschule BFH | x | x | x | https://ror.org/02bnkt322 | x | x |
-| 2 | CERN | NaN | x | x | https://ror.org/01ggx4157 | x | x |
-| 3 | Eidgenössisches Hochschulinstitut für Berufsbi... | x | x | x | https://ror.org/00zg4za48 | x | x |
-| 4 | EPF Lausanne | x | x | x | https://ror.org/02s376052 | x | x |
-| 5 | ETH Zürich | x | x | x | https://ror.org/05a28rw58 | x | x |
-| 6 | Fachhochschule Graubünden FHGR | x | x | x | https://ror.org/032ymzc07 | x | x |
-| 7 | Fachhochschule Nordwestschweiz FHNW | x | x | x | https://ror.org/04mq2g308 | x | x |
-| 8 | Forschungsinstitut für biologischen Landbau FibL | x | x | x | https://ror.org/0210tb741 | x | x |
-| 9 | Graduate Institute (IHEID) – since 2021 | x | x | x | https://ror.org/007ygn379 | x | x |
-| 10 | Haute école spécialisée de Suisse occidentale ... | x | x | x | https://ror.org/01xkakk17 | x | x |
-| 11 | HEP Berne, Jura, Neuchâtel (HEP-BEJUNE) | x | x | x | https://ror.org/015pmkr43 | x | x |
-| 12 | HEP Fribourg (PHFR) | x | x | x | https://ror.org/048gre751 | x | x |
-| 13 | HEP Vaud | x | x | x | https://ror.org/01bvm0h13 | x | x |
-| 14 | Hochschule für Wirtschaft Zürich HWZ | x | x | x | https://ror.org/02ejkey04 | x | x |
-| 15 | Hochschule Luzern HSLU | x | x | x | https://ror.org/04nd0xd48 | x | x |
-| 16 | Interkantonale Hochschule für Heilpädagogik (HfH) | x | x | x | https://ror.org/00w9q2c06 | x | x |
-| 17 | Kalaidos | x | x | x | https://ror.org/049c2kr37 | x | x |
-| 19 | Medi | NaN | x | NaN | NaN | x | x |
-| 20 | MMV - Medicine for Malaria Ventures | x | x | x | https://ror.org/00p9jf779 | x | x |
-| 21 | Ostschweizer Fachhochschulen OST | x | x | x | https://ror.org/038mj2660 | x | x |
-| 22 | Pädagogische Hochschule Zürich PHZH | x | x | x | https://ror.org/01awgk221 | x | x |
-| 23 | PH Bern | x | x | x | https://ror.org/05jf1ma54 | x | x |
-| 24 | PH Graubünden (PHGR) | x | x | x | https://ror.org/02fjgft97 | x | x |
-| 25 | PH Luzern | x | x | x | https://ror.org/0235ynq74 | x | x |
-| 26 | PH Schaffhausen (PHSH) | x | x | x | https://ror.org/03fs41j10 | x | x |
-| 27 | PH Schwyz | x | x | x | https://ror.org/00rqdn375 | x | x |
-| 28 | PH St. Gallen (PHSG) | x | x | x | https://ror.org/05m37v666 | x | x |
-| 29 | PH Thurgau (PHTG) | x | x | x | https://ror.org/04bf6dq94 | x | x |
-| 30 | PH Wallis / HEP Valais | x | x | x | https://ror.org/040gs8e06 | x | x |
-| 31 | PH Zug | x | x | x | https://ror.org/05ghhx264 | x | x |
-| 32 | Schweizerische Vogelwarte | x | x | x | https://ror.org/03mcsbr76 | x | x |
-| 33 | Scuola universitaria professionale della Svizz... | x | x | x | https://ror.org/05ep8g269 | x | x |
-| 34 | Università della Svizzera italiana USI | x | x | x | https://ror.org/03c4atk17 | x | x |
-| 35 | Universität Basel | x | x | x | https://ror.org/02s6k3f65 | x | x |
-| 36 | Universität Bern | x | x | x | https://ror.org/02k7v4d05 | x | x |
-| 37 | Universität Liechtenstein | x | x | x | https://ror.org/01qjrx392 | x | x |
-| 38 | Universität Luzern | x | x | x | https://ror.org/00kgrkn83 | x | x |
-| 39 | Universität St. Gallen | x | x | x | https://ror.org/0561a3s31 | x | x |
-| 40 | Universität Zürich | x | x | x | https://ror.org/02crff812 | x | x |
-| 41 | Université de Fribourg | x | x | x | https://ror.org/022fs9h90 | x | x |
-| 42 | Université de Genève | x | x | x | https://ror.org/01swzsf04 | x | x |
-| 43 | Université de Lausanne | x | x | x | https://ror.org/019whta54 | x | x |
-| 44 | Université de Neuchâtel | x | x | x | https://ror.org/00vasag41 | x | x |
-| 45 | Zürcher Hochschule der Künste ZHdK | x | x | x | https://ror.org/05r0ap620 | x | x |
-| 46 | Zürcher Hochschule für Angewandte Wissenschaft... | x | x | x | https://ror.org/05pmsvm27 | x | x |
-
-
-
-
-
-
-
-
-```python
-# open the Elsevier journal list
-elsevier = pd.read_excel('agreements/Elsevier_titlelist_publication.xlsx', skiprows=7)
-elsevier
-```
-
-
-
-
-
-
-
-
-
-
-| | Title | ISSN |
-|---|---|---|
-| 0 | Academic Pediatrics | 1876-2859 |
-| 1 | Accident Analysis and Prevention | 0001-4575 |
-| 2 | Accounting, Organizations and Society | 0361-3682 |
-| 3 | Acta Astronautica | 0094-5765 |
-| 4 | Acta Biomaterialia | 1742-7061 |
-| ... | ... | ... |
-| 2240 | Wound Medicine | 2213-9095 |
-| 2241 | Zeitschrift fuer Evidenz, Fortbildung und Qual... | 1865-9217 |
-| 2242 | Zeitschrift fuer Medizinische Physik | 0939-3889 |
-| 2243 | Zoologischer Anzeiger | 0044-5231 |
-| 2244 | Zoology | 0944-2006 |
-
-
-
-
2245 rows × 2 columns
-
-
-
-
-
-```python
-# add the article version field
-elsevier['article_version'] = 'published'
-elsevier
-```
-
-
-
-
-
-
-
-
-
-
-| | Title | ISSN | article_version |
-|---|---|---|---|
-| 0 | Academic Pediatrics | 1876-2859 | published |
-| 1 | Accident Analysis and Prevention | 0001-4575 | published |
-| 2 | Accounting, Organizations and Society | 0361-3682 | published |
-| 3 | Acta Astronautica | 0094-5765 | published |
-| 4 | Acta Biomaterialia | 1742-7061 | published |
-| ... | ... | ... | ... |
-| 2240 | Wound Medicine | 2213-9095 | published |
-| 2241 | Zeitschrift fuer Evidenz, Fortbildung und Qual... | 1865-9217 | published |
-| 2242 | Zeitschrift fuer Medizinische Physik | 0939-3889 | published |
-| 2243 | Zoologischer Anzeiger | 0044-5231 | published |
-| 2244 | Zoology | 0944-2006 | published |
-
-
-
-
2245 rows × 3 columns
-
-
-
-
-
-```python
-# add the validity dates
-elsevier['valid_from'] = '2020-01-01'
-elsevier['valid_until'] = '2023-12-31'
-elsevier
-```
-
-
-
-
-
-
-
-
-
-
-| | Title | ISSN | article_version | valid_from | valid_until |
-|---|---|---|---|---|---|
-| 0 | Academic Pediatrics | 1876-2859 | published | 2020-01-01 | 2023-12-31 |
-| 1 | Accident Analysis and Prevention | 0001-4575 | published | 2020-01-01 | 2023-12-31 |
-| 2 | Accounting, Organizations and Society | 0361-3682 | published | 2020-01-01 | 2023-12-31 |
-| 3 | Acta Astronautica | 0094-5765 | published | 2020-01-01 | 2023-12-31 |
-| 4 | Acta Biomaterialia | 1742-7061 | published | 2020-01-01 | 2023-12-31 |
-| ... | ... | ... | ... | ... | ... |
-| 2240 | Wound Medicine | 2213-9095 | published | 2020-01-01 | 2023-12-31 |
-| 2241 | Zeitschrift fuer Evidenz, Fortbildung und Qual... | 1865-9217 | published | 2020-01-01 | 2023-12-31 |
-| 2242 | Zeitschrift fuer Medizinische Physik | 0939-3889 | published | 2020-01-01 | 2023-12-31 |
-| 2243 | Zoologischer Anzeiger | 0044-5231 | published | 2020-01-01 | 2023-12-31 |
-| 2244 | Zoology | 0944-2006 | published | 2020-01-01 | 2023-12-31 |
-
-
-
-
2245 rows × 5 columns
-
-
-
-
-
-```python
-# add the embargo and archiving fields
-elsevier['embargo_months'] = 0
-elsevier['archiving'] = True
-elsevier
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | ISSN | article_version | valid_from | valid_until | embargo_months | archiving |
- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | Academic Pediatrics | 1876-2859 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 1 | Accident Analysis and Prevention | 0001-4575 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 2 | Accounting, Organizations and Society | 0361-3682 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 3 | Acta Astronautica | 0094-5765 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 4 | Acta Biomaterialia | 1742-7061 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | ... | ... | ... | ... | ... | ... | ... | ... |
- | 2240 | Wound Medicine | 2213-9095 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 2241 | Zeitschrift fuer Evidenz, Fortbildung und Qual... | 1865-9217 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 2242 | Zeitschrift fuer Medizinische Physik | 0939-3889 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 2243 | Zoologischer Anzeiger | 0044-5231 | published | 2020-01-01 | 2023-12-31 | 0 | True |
- | 2244 | Zoology | 0944-2006 | published | 2020-01-01 | 2023-12-31 | 0 | True |
-
2245 rows × 7 columns
-
-
-
-
-
-```python
-elsevier.iloc[elsevier.shape[0]-1]
-```
-
-
-
-
- Title Zoology
- ISSN 0944-2006
- article_version published
- valid_from 2020-01-01
- valid_until 2023-12-31
- embargo_months 0
- archiving True
- Name: 2244, dtype: object
-
-
-
-
-```python
-# add the license field
-# licenses: cc_by, cc_by_nc_nd
-rp = pd.DataFrame()
-elsevier['article_version'] = 'published'
-elsevier['license'] = 'cc_by'
-elsevier['Elsevier'] = 'x'
-rp = rp.append(elsevier, ignore_index=True)
-elsevier['license'] = 'cc_by_nc_nd'
-rp = rp.append(elsevier, ignore_index=True)
-rp
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | ISSN | article_version | valid_from | valid_until | embargo_months | archiving | license | Elsevier |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | Academic Pediatrics | 1876-2859 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by | x |
- | 1 | Accident Analysis and Prevention | 0001-4575 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by | x |
- | 2 | Accounting, Organizations and Society | 0361-3682 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by | x |
- | 3 | Acta Astronautica | 0094-5765 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by | x |
- | 4 | Acta Biomaterialia | 1742-7061 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by | x |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 4485 | Wound Medicine | 2213-9095 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by_nc_nd | x |
- | 4486 | Zeitschrift fuer Evidenz, Fortbildung und Qual... | 1865-9217 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by_nc_nd | x |
- | 4487 | Zeitschrift fuer Medizinische Physik | 0939-3889 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by_nc_nd | x |
- | 4488 | Zoologischer Anzeiger | 0044-5231 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by_nc_nd | x |
- | 4489 | Zoology | 0944-2006 | published | 2020-01-01 | 2023-12-31 | 0 | True | cc_by_nc_nd | x |
-
4490 rows × 9 columns
-
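
The cell above stacks one copy of the title list per license with `DataFrame.append`, which was deprecated in pandas 1.4 and removed in pandas 2.0. A minimal sketch of the same pattern with `pd.concat` follows; the two-row `titles` frame and the name `rp_sketch` are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Illustrative two-row stand-in for the publisher title list (assumption).
titles = pd.DataFrame({
    "Title": ["Zoology", "Acta Astronautica"],
    "ISSN": ["0944-2006", "0094-5765"],
})
licenses = ["cc_by", "cc_by_nc_nd"]

# Build one copy of the title list per license, then stack the copies.
rp_sketch = pd.concat(
    [titles.assign(article_version="published", license=lic) for lic in licenses],
    ignore_index=True,
)
print(rp_sketch)
```

Each title ends up with one row per license, so the result has `len(titles) * len(licenses)` rows, in line with the 2245 titles becoming 4490 rows above.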
-
-
-
-
-```python
-# load the Springer Nature journal list
-springer = pd.read_excel('agreements/Springer_titlelist_publication.xlsx', skiprows=7)
-springer
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | ISSN | URL |
- | --- | --- | --- | --- |
- | 0 | 3 Biotech | 2190-5738 | https://www.springer.com/journal/13205 |
- | 1 | 4OR | 1614-2411 | https://www.springer.com/journal/10288 |
- | 2 | AAPS PharmSciTech | 1530-9932 | https://www.springer.com/journal/12249 |
- | 3 | Abdominal Radiology | 2366-0058 | https://www.springer.com/journal/261 |
- | 4 | Abhandlungen aus dem Mathematischen Seminar de... | 1865-8784 | https://www.springer.com/journal/12188 |
- | ... | ... | ... | ... |
- | 2035 | Zeitschrift für Religion, Gesellschaft und Pol... | 2510-1226 | https://www.springer.com/journal/41682 |
- | 2036 | Zeitschrift für Rheumatologie | 1435-1250 | https://www.springer.com/journal/393 |
- | 2037 | Zeitschrift für Vergleichende Politikwissenschaft | 1865-2654 | https://www.springer.com/journal/12286 |
- | 2038 | Zentralblatt für Arbeitsmedizin, Arbeitsschutz... | 2198-0713 | https://www.springer.com/journal/40664 |
- | 2039 | Zoomorphology | 1432-234X | https://www.springer.com/journal/435 |
-
2040 rows × 3 columns
-
-
-
-
-
-```python
-# add the license field
-# licenses: cc_by, cc_by_nc
-springer['article_version'] = 'published'
-springer['license'] = 'cc_by'
-springer['Springer Nature'] = 'x'
-# add the validity dates
-springer['valid_from'] = '2020-01-01'
-springer['valid_until'] = '2022-12-31'
-# add the embargo and archiving fields
-springer['embargo_months'] = 0
-springer['archiving'] = True
-```
-
-
-```python
-# append
-rp = rp.append(springer, ignore_index=True)
-springer['license'] = 'cc_by_nc'
-rp = rp.append(springer, ignore_index=True)
-rp
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py:7123: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
- of pandas will change to not sort by default.
-
- To accept the future behavior, pass 'sort=False'.
-
- To retain the current behavior and silence the warning, pass 'sort=True'.
-
- sort=sort,
-
-
-
-
-
-
-
-
-
-
-
- |   | Elsevier | ISSN | Springer Nature | Title | URL | archiving | article_version | embargo_months | license | valid_from | valid_until |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | x | 1876-2859 | NaN | Academic Pediatrics | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 1 | x | 0001-4575 | NaN | Accident Analysis and Prevention | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 2 | x | 0361-3682 | NaN | Accounting, Organizations and Society | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 3 | x | 0094-5765 | NaN | Acta Astronautica | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 4 | x | 1742-7061 | NaN | Acta Biomaterialia | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 8565 | NaN | 2510-1226 | x | Zeitschrift für Religion, Gesellschaft und Pol... | https://www.springer.com/journal/41682 | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 |
- | 8566 | NaN | 1435-1250 | x | Zeitschrift für Rheumatologie | https://www.springer.com/journal/393 | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 |
- | 8567 | NaN | 1865-2654 | x | Zeitschrift für Vergleichende Politikwissenschaft | https://www.springer.com/journal/12286 | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 |
- | 8568 | NaN | 2198-0713 | x | Zentralblatt für Arbeitsmedizin, Arbeitsschutz... | https://www.springer.com/journal/40664 | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 |
- | 8569 | NaN | 1432-234X | x | Zoomorphology | https://www.springer.com/journal/435 | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 |
-
8570 rows × 11 columns
-
-
-
-
-
-```python
-# load the Wiley journal list
-wiley = pd.read_excel('agreements/Wiley_titlelist_publish.xlsx', skiprows=7)
-wiley
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | ISSN | URL |
- | --- | --- | --- | --- |
- | 0 | ABACUS | 1467-6281 | https://onlinelibrary.wiley.com/journal/14676281 |
- | 1 | ACADEMIC EMERGENCY MEDICINE | 1553-2712 | https://onlinelibrary.wiley.com/journal/15532712 |
- | 2 | ACCOUNTING & FINANCE | 1467-629X | https://onlinelibrary.wiley.com/journal/1467629X |
- | 3 | ACCOUNTING PERSPECTIVES | 1911-3838 | https://onlinelibrary.wiley.com/journal/19113838 |
- | 4 | ACTA ANAESTHESIOLOGICA SCANDINAVICA | 1399-6576 | https://onlinelibrary.wiley.com/journal/13996576 |
- | ... | ... | ... | ... |
- | 1391 | ZEITSCHRIFT FüR ANORGANISCHE UND ALLGEMEINE CH... | 1521-3749 | https://onlinelibrary.wiley.com/journal/15213749 |
- | 1392 | ZOO BIOLOGY | 1098-2361 | https://onlinelibrary.wiley.com/journal/10982361 |
- | 1393 | ZOOLOGICA SCRIPTA | 1463-6409 | https://onlinelibrary.wiley.com/journal/14636409 |
- | 1394 | ZOONOSES AND PUBLIC HEALTH | 1863-2378 | https://onlinelibrary.wiley.com/journal/18632378 |
- | 1395 | ZYGON® JOURNAL OF RELIGION AND SCIENCE | 1467-9744 | https://onlinelibrary.wiley.com/journal/14679744 |
-
1396 rows × 3 columns
-
-
-
-
-
-```python
-# add the license field
-# licenses: cc_by, cc_by_nc, cc_by_nc_nd
-wiley['article_version'] = 'published'
-wiley['license'] = 'cc_by'
-wiley['Wiley'] = 'x'
-# add the validity dates
-wiley['valid_from'] = '2021-01-01'
-wiley['valid_until'] = '2024-12-31'
-# add the embargo and archiving fields
-wiley['embargo_months'] = 0
-wiley['archiving'] = True
-rp = rp.append(wiley, ignore_index=True)
-# append again with another license
-wiley['license'] = 'cc_by_nc'
-rp = rp.append(wiley, ignore_index=True)
-# append again with another license
-wiley['license'] = 'cc_by_nc_nd'
-rp = rp.append(wiley, ignore_index=True)
-rp
-```
-
-
-
-
-
-
-
-
-
-
- |   | Elsevier | ISSN | Springer Nature | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | x | 1876-2859 | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 1 | x | 0001-4575 | NaN | Accident Analysis and Prevention | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 2 | x | 0361-3682 | NaN | Accounting, Organizations and Society | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 3 | x | 0094-5765 | NaN | Acta Astronautica | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 4 | x | 1742-7061 | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 12753 | NaN | 1521-3749 | NaN | ZEITSCHRIFT FüR ANORGANISCHE UND ALLGEMEINE CH... | https://onlinelibrary.wiley.com/journal/15213749 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 |
- | 12754 | NaN | 1098-2361 | NaN | ZOO BIOLOGY | https://onlinelibrary.wiley.com/journal/10982361 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 |
- | 12755 | NaN | 1463-6409 | NaN | ZOOLOGICA SCRIPTA | https://onlinelibrary.wiley.com/journal/14636409 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 |
- | 12756 | NaN | 1863-2378 | NaN | ZOONOSES AND PUBLIC HEALTH | https://onlinelibrary.wiley.com/journal/18632378 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 |
- | 12757 | NaN | 1467-9744 | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 |
-
12758 rows × 12 columns
-
-
-
-
-
-```python
-# load the TF (Taylor & Francis) journal list
-tf = pd.read_excel('agreements/TandF_titlelist_publish.xlsx', skiprows=7)
-tf
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | ISSN |
- | --- | --- | --- |
- | 0 | a/b: Auto/Biography Studies | 2151-7290 |
- | 1 | Accountability in Research | 1545-5815 |
- | 2 | Accounting and Business Research | 2159-4260 |
- | 3 | Accounting Education | 1468-4489 |
- | 4 | Accounting Forum | 1467-6303 |
- | ... | ... | ... |
- | 2401 | Writing Systems Research | NaN |
- | 2402 | Xenobiotica | 1366-5928 |
- | 2403 | Yorkshire Archaeological Journal | 2045-0664 |
- | 2404 | Youth Theatre Journal | 1948-4798 |
- | 2405 | Zoology in the Middle East | 2326-2680 |
-
2406 rows × 2 columns
-
-
-
-
-
-```python
-# add the license field
-# licenses: cc_by, cc_by_nc, cc_by_nc_nd
-tf['article_version'] = 'published'
-tf['license'] = 'cc_by'
-tf['TF'] = 'x'
-# add the validity dates
-tf['valid_from'] = '2021-01-01'
-tf['valid_until'] = '2023-12-31'
-# add the embargo and archiving fields
-tf['embargo_months'] = 0
-tf['archiving'] = True
-```
-
-
-```python
-# append
-rp = rp.append(tf, ignore_index=True)
-rp
-```
-
-
-
-
-
-
-
-
-
-
- |   | Elsevier | ISSN | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 1 | x | 0001-4575 | NaN | NaN | Accident Analysis and Prevention | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 2 | x | 0361-3682 | NaN | NaN | Accounting, Organizations and Society | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 3 | x | 0094-5765 | NaN | NaN | Acta Astronautica | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 4 | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 15159 | NaN | NaN | NaN | x | Writing Systems Research | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 |
- | 15160 | NaN | 1366-5928 | NaN | x | Xenobiotica | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 |
- | 15161 | NaN | 2045-0664 | NaN | x | Yorkshire Archaeological Journal | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 |
- | 15162 | NaN | 1948-4798 | NaN | x | Youth Theatre Journal | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 |
- | 15163 | NaN | 2326-2680 | NaN | x | Zoology in the Middle East | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 |
-
15164 rows × 13 columns
-
-
-
-
-
-```python
-# load the CUP (Cambridge University Press) journal list
-cup = pd.read_excel('agreements/CUP_Journals_titlelist_publish.xlsx', skiprows=7)
-cup
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | e-ISSN | URL |
- | --- | --- | --- | --- |
- | 0 | Agricultural and Resource Economics Review | 2372-2614 | http://www.cambridge.org/core/product/identifi... |
- | 1 | AJIL Unbound | 2398-7723 | http://www.cambridge.org/core/product/identifi... |
- | 2 | Annals of Glaciology | 1727-5644 | http://www.cambridge.org/core/product/identifi... |
- | 3 | APSIPA Transactions on Signal and Information ... | 2048-7703 | http://www.cambridge.org/core/product/identifi... |
- | 4 | Biological Imaging | 2633-903X | http://www.cambridge.org/core/product/identifi... |
- | ... | ... | ... | ... |
- | 366 | Visual Neuroscience | 1469-8714 | http://www.cambridge.org/core/product/identifi... |
- | 367 | Weed Science | 1550-2759 | http://www.cambridge.org/core/product/identifi... |
- | 368 | Weed Technology | 1550-2740 | http://www.cambridge.org/core/product/identifi... |
- | 369 | World Trade Review | 1475-3138 | http://www.cambridge.org/core/product/identifi... |
- | 370 | Zygote | 1469-8730 | http://www.cambridge.org/core/product/identifi... |
-
371 rows × 3 columns
-
-
-
-
-
-```python
-# rename the e-ISSN column to ISSN
-cup = cup.rename(columns = {'e-ISSN' : 'ISSN'})
-cup
-```
-
-
-
-
-
-
-
-
-
-
- |   | Title | ISSN | URL |
- | --- | --- | --- | --- |
- | 0 | Agricultural and Resource Economics Review | 2372-2614 | http://www.cambridge.org/core/product/identifi... |
- | 1 | AJIL Unbound | 2398-7723 | http://www.cambridge.org/core/product/identifi... |
- | 2 | Annals of Glaciology | 1727-5644 | http://www.cambridge.org/core/product/identifi... |
- | 3 | APSIPA Transactions on Signal and Information ... | 2048-7703 | http://www.cambridge.org/core/product/identifi... |
- | 4 | Biological Imaging | 2633-903X | http://www.cambridge.org/core/product/identifi... |
- | ... | ... | ... | ... |
- | 366 | Visual Neuroscience | 1469-8714 | http://www.cambridge.org/core/product/identifi... |
- | 367 | Weed Science | 1550-2759 | http://www.cambridge.org/core/product/identifi... |
- | 368 | Weed Technology | 1550-2740 | http://www.cambridge.org/core/product/identifi... |
- | 369 | World Trade Review | 1475-3138 | http://www.cambridge.org/core/product/identifi... |
- | 370 | Zygote | 1469-8730 | http://www.cambridge.org/core/product/identifi... |
-
371 rows × 3 columns
-
-
-
-
-
-```python
-# add the license field
-# licenses: cc_by, cc_by_nc, cc_by_nc_nd, cc_by_nc_sa
-cup['article_version'] = 'published'
-cup['license'] = 'cc_by'
-cup['CUP'] = 'x'
-# add the validity dates
-cup['valid_from'] = '2021-01-01'
-cup['valid_until'] = '2023-12-31'
-# add the embargo and archiving fields
-cup['embargo_months'] = 60
-cup['archiving'] = True
-```
-
-
-```python
-# append
-rp = rp.append(cup, ignore_index=True)
-cup['license'] = 'cc_by_nc'
-rp = rp.append(cup, ignore_index=True)
-cup['license'] = 'cc_by_nc_nd'
-rp = rp.append(cup, ignore_index=True)
-cup['license'] = 'cc_by_nc_sa'
-rp = rp.append(cup, ignore_index=True)
-rp
-```
-
-
-
-
-
-
-
-
-
-
- |   | CUP | Elsevier | ISSN | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 1 | NaN | x | 0001-4575 | NaN | NaN | Accident Analysis and Prevention | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 2 | NaN | x | 0361-3682 | NaN | NaN | Accounting, Organizations and Society | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 3 | NaN | x | 0094-5765 | NaN | NaN | Acta Astronautica | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | 4 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 16643 | x | NaN | 1469-8714 | NaN | NaN | Visual Neuroscience | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 |
- | 16644 | x | NaN | 1550-2759 | NaN | NaN | Weed Science | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 |
- | 16645 | x | NaN | 1550-2740 | NaN | NaN | Weed Technology | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 |
- | 16646 | x | NaN | 1475-3138 | NaN | NaN | World Trade Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 |
- | 16647 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 |
-
16648 rows × 14 columns
-
-
-
-
-
-```python
-# check for rows without an embargo value
-rp.loc[rp['embargo_months'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
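- |   | CUP | Elsevier | ISSN | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |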
-
-
-
-
-
-
-
-
-
-
-```python
-# add the ISSN-L values
-issnl = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-issnl
-```
-
-
-
-
-
-
-
-
-
-
- |   | ISSN | ISSN-L |
- | --- | --- | --- |
- | 0 | 0000-0019 | 0000-0019 |
- | 1 | 0000-0027 | 0000-0027 |
- | 2 | 0000-0043 | 0000-0043 |
- | 3 | 0000-0051 | 0000-0051 |
- | 4 | 0000-006X | 0000-006X |
- | ... | ... | ... |
- | 1995913 | 8756-9957 | 8756-9957 |
- | 1995914 | 8756-9965 | 8756-9965 |
- | 1995915 | 8756-9973 | 8756-9973 |
- | 1995916 | 8756-9981 | 8756-9981 |
- | 1995917 | 8756-999X | 8756-999X |
-
1995918 rows × 2 columns
-
-
-
-
-
-```python
-# rename the columns
-issnl = issnl.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-rp = rp.rename(columns={'ISSN' : 'issn'})
-```
-
-
-```python
-# merge
-rp = pd.merge(rp, issnl, on='issn', how='left')
-rp
-```
-
-
-
-
-
-
-
-
-
-
- |   | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 |
- | 1 | NaN | x | 0001-4575 | NaN | NaN | Accident Analysis and Prevention | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 0001-4575 |
- | 2 | NaN | x | 0361-3682 | NaN | NaN | Accounting, Organizations and Society | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 0361-3682 |
- | 3 | NaN | x | 0094-5765 | NaN | NaN | Acta Astronautica | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 0094-5765 |
- | 4 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 16643 | x | NaN | 1469-8714 | NaN | NaN | Visual Neuroscience | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0952-5238 |
- | 16644 | x | NaN | 1550-2759 | NaN | NaN | Weed Science | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0043-1745 |
- | 16645 | x | NaN | 1550-2740 | NaN | NaN | Weed Technology | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0890-037X |
- | 16646 | x | NaN | 1475-3138 | NaN | NaN | World Trade Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1474-7456 |
- | 16647 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 |
-
16648 rows × 15 columns
-
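
The left merge above keeps every agreement row and fills `issnl` only where the ISSN appears in the ISSN-to-ISSN-L table. A minimal sketch of that behaviour follows, assuming three-row stand-in frames (`rp_sketch` and `issnl_sketch` are illustrative names); the ISSN values are taken from the outputs above.

```python
import pandas as pd

# Small stand-ins for the agreement rows and the ISSN-to-ISSN-L table (assumption).
rp_sketch = pd.DataFrame({
    "Title": ["Zygote", "Academic Pediatrics", "Writing Systems Research"],
    "issn": ["1469-8730", "1876-2859", None],  # the last title has no ISSN
})
issnl_sketch = pd.DataFrame({
    "issn": ["1469-8730", "1876-2859"],
    "issnl": ["0967-1994", "1876-2859"],
})

# how='left' keeps all rows of rp_sketch, matched or not.
merged = pd.merge(rp_sketch, issnl_sketch, on="issn", how="left")

# Rows that found no ISSN-L are worth listing before any later join on issnl.
unmatched = merged.loc[merged["issnl"].isna()]
print(unmatched)
```

A check like `unmatched` could help spot titles such as the one without an ISSN in the TF list, which would otherwise silently carry a missing `issnl`.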
-
-
-
-
-```python
-# combine the ISSNs for the merge
-# rp_1 = rp.loc[rp['issnl'].notna()][['issnl', 'article_version', 'license', 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP']]
-# rp_1 = rp_1.rename(columns = {'issnl' : 'issn'})
-# rp_2 = rp.loc[rp['issn'].notna()][['issn', 'article_version', 'license', 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP']]
-# rp_all = rp_1.append(rp_2, ignore_index=True)
-rp_all = rp
-```
-
-
-```python
-# add the missing fields
-# discount value (id 2) set to 100% for read & publish licenses
-# elsevier['amount'] = 100
-# elsevier['symbol'] = '%'
-# elsevier['cost_factor_type'] = 2
-# elsevier['comment'] = 'Source: swissuniversities'
-# elsevier
-```
-
-
-```python
-# merge with the organisations
-# publisher columns: 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_elsevier = participants.loc[participants['Elsevier'].notna()][['Elsevier', 'ROR']]
-rp_elsevier = rp_all.loc[rp_all['Elsevier'].notna()]
-rp_1 = pd.merge(rp_elsevier, participants_elsevier, on='Elsevier', how='outer')
-rp_1
-```
-
-
-
-
-
-
-
-
-
-
- |   | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/04d8ztx87 |
- | 1 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02bnkt322 |
- | 2 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/00zg4za48 |
- | 3 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02s376052 |
- | 4 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/05a28rw58 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 197555 | NaN | x | 0944-2006 | NaN | NaN | Zoology | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0944-2006 | https://ror.org/01swzsf04 |
- | 197556 | NaN | x | 0944-2006 | NaN | NaN | Zoology | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0944-2006 | https://ror.org/019whta54 |
- | 197557 | NaN | x | 0944-2006 | NaN | NaN | Zoology | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0944-2006 | https://ror.org/00vasag41 |
- | 197558 | NaN | x | 0944-2006 | NaN | NaN | Zoology | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0944-2006 | https://ror.org/05r0ap620 |
- | 197559 | NaN | x | 0944-2006 | NaN | NaN | Zoology | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0944-2006 | https://ror.org/05pmsvm27 |
-
197560 rows × 16 columns
-
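
Because every journal row and every participant row carries the same marker value `'x'` in the publisher column, merging on that column pairs each journal with each institution, which is effectively a cross join. This is consistent with the 197560 rows reported above for 4490 Elsevier rows. The sketch below illustrates this on made-up two-row and three-row frames (all frame names are illustrative assumptions); `how='cross'` is available in pandas 1.2 and later.

```python
import pandas as pd

# Illustrative stand-ins: two journal rows and three participating institutions.
journals = pd.DataFrame({
    "Title": ["Zoology", "Acta Astronautica"],
    "issn": ["0944-2006", "0094-5765"],
    "Elsevier": ["x", "x"],
})
participants_sketch = pd.DataFrame({
    "Elsevier": ["x", "x", "x"],
    "ROR": [
        "https://ror.org/04d8ztx87",
        "https://ror.org/02bnkt322",
        "https://ror.org/00zg4za48",
    ],
})

# Merging on a constant key matches every left row with every right row.
pairs = pd.merge(journals, participants_sketch, on="Elsevier", how="outer")

# The same pairing stated explicitly as a cross join (pandas >= 1.2).
pairs_cross = journals.drop(columns="Elsevier").merge(participants_sketch, how="cross")

print(len(pairs), len(pairs_cross))
```

Writing the join as `how='cross'` makes the intent visible and does not depend on the marker column holding the same value on both sides.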
-
-
-
-
-```python
-rp_elsevier
-```
-
-
-
-
-
-
-
-
-
-
- |   | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 |
- | 1 | NaN | x | 0001-4575 | NaN | NaN | Accident Analysis and Prevention | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 0001-4575 |
- | 2 | NaN | x | 0361-3682 | NaN | NaN | Accounting, Organizations and Society | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 0361-3682 |
- | 3 | NaN | x | 0094-5765 | NaN | NaN | Acta Astronautica | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 0094-5765 |
- | 4 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 4485 | NaN | x | 2213-9095 | NaN | NaN | Wound Medicine | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 2213-9095 |
- | 4486 | NaN | x | 1865-9217 | NaN | NaN | Zeitschrift fuer Evidenz, Fortbildung und Qual... | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 1865-9217 |
- | 4487 | NaN | x | 0939-3889 | NaN | NaN | Zeitschrift fuer Medizinische Physik | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0939-3889 |
- | 4488 | NaN | x | 0044-5231 | NaN | NaN | Zoologischer Anzeiger | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0044-5231 |
- | 4489 | NaN | x | 0944-2006 | NaN | NaN | Zoology | NaN | NaN | True | published | 0 | cc_by_nc_nd | 2020-01-01 | 2023-12-31 | 0944-2006 |
-
4490 rows × 15 columns
-
-
-
-
-
-```python
-participants_elsevier
-```
-
-
-
-
-
-
-
-
-
-
- |   | Elsevier | ROR |
- | --- | --- | --- |
- | 0 | x | https://ror.org/04d8ztx87 |
- | 1 | x | https://ror.org/02bnkt322 |
- | 3 | x | https://ror.org/00zg4za48 |
- | 4 | x | https://ror.org/02s376052 |
- | 5 | x | https://ror.org/05a28rw58 |
- | 6 | x | https://ror.org/032ymzc07 |
- | 7 | x | https://ror.org/04mq2g308 |
- | 8 | x | https://ror.org/0210tb741 |
- | 9 | x | https://ror.org/007ygn379 |
- | 10 | x | https://ror.org/01xkakk17 |
- | 11 | x | https://ror.org/015pmkr43 |
- | 12 | x | https://ror.org/048gre751 |
- | 13 | x | https://ror.org/01bvm0h13 |
- | 14 | x | https://ror.org/02ejkey04 |
- | 15 | x | https://ror.org/04nd0xd48 |
- | 16 | x | https://ror.org/00w9q2c06 |
- | 17 | x | https://ror.org/049c2kr37 |
- | 20 | x | https://ror.org/00p9jf779 |
- | 21 | x | https://ror.org/038mj2660 |
- | 22 | x | https://ror.org/01awgk221 |
- | 23 | x | https://ror.org/05jf1ma54 |
- | 24 | x | https://ror.org/02fjgft97 |
- | 25 | x | https://ror.org/0235ynq74 |
- | 26 | x | https://ror.org/03fs41j10 |
- | 27 | x | https://ror.org/00rqdn375 |
- | 28 | x | https://ror.org/05m37v666 |
- | 29 | x | https://ror.org/04bf6dq94 |
- | 30 | x | https://ror.org/040gs8e06 |
- | 31 | x | https://ror.org/05ghhx264 |
- | 32 | x | https://ror.org/03mcsbr76 |
- | 33 | x | https://ror.org/05ep8g269 |
- | 34 | x | https://ror.org/03c4atk17 |
- | 35 | x | https://ror.org/02s6k3f65 |
- | 36 | x | https://ror.org/02k7v4d05 |
- | 37 | x | https://ror.org/01qjrx392 |
- | 38 | x | https://ror.org/00kgrkn83 |
- | 39 | x | https://ror.org/0561a3s31 |
- | 40 | x | https://ror.org/02crff812 |
- | 41 | x | https://ror.org/022fs9h90 |
- | 42 | x | https://ror.org/01swzsf04 |
- | 43 | x | https://ror.org/019whta54 |
- | 44 | x | https://ror.org/00vasag41 |
- | 45 | x | https://ror.org/05r0ap620 |
- | 46 | x | https://ror.org/05pmsvm27 |
-
-
-
-
-
-
-
-```python
-# merge with the organisations
-# publisher columns: 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_springer = participants.loc[participants['Springer Nature'].notna()][['Springer Nature', 'ROR']]
-rp_springer = rp_all.loc[rp_all['Springer Nature'].notna()]
-rp_2 = pd.merge(rp_springer, participants_springer, on='Springer Nature', how='outer')
-rp_2
-```
-
-
-
-
-
-
-
-
-
-
- |   | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0 | NaN | NaN | 2190-5738 | x | NaN | 3 Biotech | https://www.springer.com/journal/13205 | NaN | True | published | 0 | cc_by | 2020-01-01 | 2022-12-31 | 2190-5738 | https://ror.org/04d8ztx87 |
- | 1 | NaN | NaN | 2190-5738 | x | NaN | 3 Biotech | https://www.springer.com/journal/13205 | NaN | True | published | 0 | cc_by | 2020-01-01 | 2022-12-31 | 2190-5738 | https://ror.org/02bnkt322 |
- | 2 | NaN | NaN | 2190-5738 | x | NaN | 3 Biotech | https://www.springer.com/journal/13205 | NaN | True | published | 0 | cc_by | 2020-01-01 | 2022-12-31 | 2190-5738 | https://ror.org/01ggx4157 |
- | 3 | NaN | NaN | 2190-5738 | x | NaN | 3 Biotech | https://www.springer.com/journal/13205 | NaN | True | published | 0 | cc_by | 2020-01-01 | 2022-12-31 | 2190-5738 | https://ror.org/00zg4za48 |
- | 4 | NaN | NaN | 2190-5738 | x | NaN | 3 Biotech | https://www.springer.com/journal/13205 | NaN | True | published | 0 | cc_by | 2020-01-01 | 2022-12-31 | 2190-5738 | https://ror.org/02s376052 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 187675 | NaN | NaN | 1432-234X | x | NaN | Zoomorphology | https://www.springer.com/journal/435 | NaN | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 | 0720-213X | https://ror.org/01swzsf04 |
- | 187676 | NaN | NaN | 1432-234X | x | NaN | Zoomorphology | https://www.springer.com/journal/435 | NaN | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 | 0720-213X | https://ror.org/019whta54 |
- | 187677 | NaN | NaN | 1432-234X | x | NaN | Zoomorphology | https://www.springer.com/journal/435 | NaN | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 | 0720-213X | https://ror.org/00vasag41 |
- | 187678 | NaN | NaN | 1432-234X | x | NaN | Zoomorphology | https://www.springer.com/journal/435 | NaN | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 | 0720-213X | https://ror.org/05r0ap620 |
- | 187679 | NaN | NaN | 1432-234X | x | NaN | Zoomorphology | https://www.springer.com/journal/435 | NaN | True | published | 0 | cc_by_nc | 2020-01-01 | 2022-12-31 | 0720-213X | https://ror.org/05pmsvm27 |
-
187680 rows × 16 columns
-
-
-
-
-
-```python
-# merge with the organisations
-# publisher columns: 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_wiley = participants.loc[participants['Wiley'].notna()][['Wiley', 'ROR']]
-rp_wiley = rp_all.loc[rp_all['Wiley'].notna()]
-rp_3 = pd.merge(rp_wiley, participants_wiley, on='Wiley', how='outer')
-rp_3
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | NaN | 1467-6281 | NaN | NaN | ABACUS | https://onlinelibrary.wiley.com/journal/14676281 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-3072 | https://ror.org/04d8ztx87 |
-| 1 | NaN | NaN | 1467-6281 | NaN | NaN | ABACUS | https://onlinelibrary.wiley.com/journal/14676281 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-3072 | https://ror.org/02bnkt322 |
-| 2 | NaN | NaN | 1467-6281 | NaN | NaN | ABACUS | https://onlinelibrary.wiley.com/journal/14676281 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-3072 | https://ror.org/01ggx4157 |
-| 3 | NaN | NaN | 1467-6281 | NaN | NaN | ABACUS | https://onlinelibrary.wiley.com/journal/14676281 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-3072 | https://ror.org/00zg4za48 |
-| 4 | NaN | NaN | 1467-6281 | NaN | NaN | ABACUS | https://onlinelibrary.wiley.com/journal/14676281 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-3072 | https://ror.org/02s376052 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 188455 | NaN | NaN | 1467-9744 | NaN | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0591-2385 | https://ror.org/01swzsf04 |
-| 188456 | NaN | NaN | 1467-9744 | NaN | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0591-2385 | https://ror.org/019whta54 |
-| 188457 | NaN | NaN | 1467-9744 | NaN | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0591-2385 | https://ror.org/00vasag41 |
-| 188458 | NaN | NaN | 1467-9744 | NaN | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0591-2385 | https://ror.org/05r0ap620 |
-| 188459 | NaN | NaN | 1467-9744 | NaN | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0591-2385 | https://ror.org/05pmsvm27 |
-
-
-
-
188460 rows × 16 columns
-
-
-
-
-
-```python
-rp_wiley
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 8570 | NaN | NaN | 1467-6281 | NaN | NaN | ABACUS | https://onlinelibrary.wiley.com/journal/14676281 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-3072 |
-| 8571 | NaN | NaN | 1553-2712 | NaN | NaN | ACADEMIC EMERGENCY MEDICINE | https://onlinelibrary.wiley.com/journal/15532712 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 1069-6563 |
-| 8572 | NaN | NaN | 1467-629X | NaN | NaN | ACCOUNTING & FINANCE | https://onlinelibrary.wiley.com/journal/1467629X | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0810-5391 |
-| 8573 | NaN | NaN | 1911-3838 | NaN | NaN | ACCOUNTING PERSPECTIVES | https://onlinelibrary.wiley.com/journal/19113838 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 1911-382X |
-| 8574 | NaN | NaN | 1399-6576 | NaN | NaN | ACTA ANAESTHESIOLOGICA SCANDINAVICA | https://onlinelibrary.wiley.com/journal/13996576 | x | True | published | 0 | cc_by | 2021-01-01 | 2024-12-31 | 0001-5172 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 12753 | NaN | NaN | 1521-3749 | NaN | NaN | ZEITSCHRIFT FüR ANORGANISCHE UND ALLGEMEINE CH... | https://onlinelibrary.wiley.com/journal/15213749 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0044-2313 |
-| 12754 | NaN | NaN | 1098-2361 | NaN | NaN | ZOO BIOLOGY | https://onlinelibrary.wiley.com/journal/10982361 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0733-3188 |
-| 12755 | NaN | NaN | 1463-6409 | NaN | NaN | ZOOLOGICA SCRIPTA | https://onlinelibrary.wiley.com/journal/14636409 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0300-3256 |
-| 12756 | NaN | NaN | 1863-2378 | NaN | NaN | ZOONOSES AND PUBLIC HEALTH | https://onlinelibrary.wiley.com/journal/18632378 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 1863-1959 |
-| 12757 | NaN | NaN | 1467-9744 | NaN | NaN | ZYGON® JOURNAL OF RELIGION AND SCIENCE | https://onlinelibrary.wiley.com/journal/14679744 | x | True | published | 0 | cc_by_nc_nd | 2021-01-01 | 2024-12-31 | 0591-2385 |
-
-
-
-
4188 rows × 15 columns
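
The row counts above are consistent with the merge acting as a cross join: 4188 Wiley journals times 45 participating institutions gives the 188460 rows of `rp_3` (the figure 45 is inferred from the counts, it is not stated in the notebook). A minimal sketch with invented data, assuming the publisher column only ever holds the marker `x`; the frame contents and the `ror-A`/`ror-B` identifiers are placeholders:

```python
import pandas as pd

# Hypothetical miniature versions of rp_all and participants.
journals = pd.DataFrame({
    "Title": ["ABACUS", "ZOO BIOLOGY", "ZOOLOGICA SCRIPTA"],
    "Wiley": ["x", "x", "x"],
})
participants = pd.DataFrame({
    "ROR": ["ror-A", "ror-B"],  # placeholder identifiers, not real ROR IDs
    "Wiley": ["x", "x"],
})

# Every key equals "x", so each journal matches each participant.
merged = pd.merge(journals, participants, on="Wiley", how="outer")

# pandas >= 1.2 can state the same intent explicitly with how="cross".
crossed = pd.merge(journals.drop(columns="Wiley"), participants, how="cross")

print(len(merged), len(crossed))
```

Both frames have one row per journal and participant pair. The explicit cross join does not depend on the marker values being identical on both sides.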
-
-
-
-
-
-```python
-# merge with the organisations (pairs each TF journal with each participating institution's ROR)
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_tf = participants.loc[participants['TF'].notna(), ['TF', 'ROR']]
-rp_tf = rp_all.loc[rp_all['TF'].notna()]
-rp_4 = pd.merge(rp_tf, participants_tf, on='TF', how='outer')
-rp_4
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | NaN | 2151-7290 | NaN | x | a/b: Auto/Biography Studies | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0898-9575 | https://ror.org/04d8ztx87 |
-| 1 | NaN | NaN | 2151-7290 | NaN | x | a/b: Auto/Biography Studies | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0898-9575 | https://ror.org/02bnkt322 |
-| 2 | NaN | NaN | 2151-7290 | NaN | x | a/b: Auto/Biography Studies | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0898-9575 | https://ror.org/01ggx4157 |
-| 3 | NaN | NaN | 2151-7290 | NaN | x | a/b: Auto/Biography Studies | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0898-9575 | https://ror.org/00zg4za48 |
-| 4 | NaN | NaN | 2151-7290 | NaN | x | a/b: Auto/Biography Studies | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0898-9575 | https://ror.org/02s376052 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 110671 | NaN | NaN | 2326-2680 | NaN | x | Zoology in the Middle East | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0939-7140 | https://ror.org/01swzsf04 |
-| 110672 | NaN | NaN | 2326-2680 | NaN | x | Zoology in the Middle East | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0939-7140 | https://ror.org/019whta54 |
-| 110673 | NaN | NaN | 2326-2680 | NaN | x | Zoology in the Middle East | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0939-7140 | https://ror.org/00vasag41 |
-| 110674 | NaN | NaN | 2326-2680 | NaN | x | Zoology in the Middle East | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0939-7140 | https://ror.org/05r0ap620 |
-| 110675 | NaN | NaN | 2326-2680 | NaN | x | Zoology in the Middle East | NaN | NaN | True | published | 0 | cc_by | 2021-01-01 | 2023-12-31 | 0939-7140 | https://ror.org/05pmsvm27 |
-
-
-
-
110676 rows × 16 columns
-
-
-
-
-
-```python
-# merge with the organisations (pairs each CUP journal with each participating institution's ROR)
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_cup = participants.loc[participants['CUP'].notna(), ['CUP', 'ROR']]
-rp_cup = rp_all.loc[rp_all['CUP'].notna()]
-rp_5 = pd.merge(rp_cup, participants_cup, on='CUP', how='outer')
-rp_5
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | x | NaN | 2372-2614 | NaN | NaN | Agricultural and Resource Economics Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by | 2021-01-01 | 2023-12-31 | 1068-2805 | https://ror.org/04d8ztx87 |
-| 1 | x | NaN | 2372-2614 | NaN | NaN | Agricultural and Resource Economics Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by | 2021-01-01 | 2023-12-31 | 1068-2805 | https://ror.org/02bnkt322 |
-| 2 | x | NaN | 2372-2614 | NaN | NaN | Agricultural and Resource Economics Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by | 2021-01-01 | 2023-12-31 | 1068-2805 | https://ror.org/01ggx4157 |
-| 3 | x | NaN | 2372-2614 | NaN | NaN | Agricultural and Resource Economics Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by | 2021-01-01 | 2023-12-31 | 1068-2805 | https://ror.org/00zg4za48 |
-| 4 | x | NaN | 2372-2614 | NaN | NaN | Agricultural and Resource Economics Review | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by | 2021-01-01 | 2023-12-31 | 1068-2805 | https://ror.org/02s376052 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 68259 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/01swzsf04 |
-| 68260 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/019whta54 |
-| 68261 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/00vasag41 |
-| 68262 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05r0ap620 |
-| 68263 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05pmsvm27 |
-
-
-
-
68264 rows × 16 columns
-
-
-
-
-
-```python
-# concatenate the five publisher tables
-# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
-rp_fin = pd.concat([rp_1, rp_2, rp_3, rp_4, rp_5], ignore_index=True)
-rp_fin
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/04d8ztx87 |
-| 1 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02bnkt322 |
-| 2 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/00zg4za48 |
-| 3 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02s376052 |
-| 4 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/05a28rw58 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 752635 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/01swzsf04 |
-| 752636 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/019whta54 |
-| 752637 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/00vasag41 |
-| 752638 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05r0ap620 |
-| 752639 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05pmsvm27 |
-
-
-
-
752640 rows × 16 columns
-
-
-
-
-
-```python
-# drop rows without an ISSN, then drop duplicates
-rp_fin = rp_fin.dropna(subset=['issn'])
-rp_fin = rp_fin.drop_duplicates(subset=['issn', 'license', 'ROR'])
-rp_fin
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/04d8ztx87 |
-| 1 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02bnkt322 |
-| 2 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/00zg4za48 |
-| 3 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02s376052 |
-| 4 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/05a28rw58 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 752635 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/01swzsf04 |
-| 752636 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/019whta54 |
-| 752637 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/00vasag41 |
-| 752638 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05r0ap620 |
-| 752639 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05pmsvm27 |
-
-
-
-
751628 rows × 16 columns
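
The cleaning step can be tried on a toy frame. The sketch below uses invented rows and assumes, as in the cell above, that a row is identified by the triple `issn`, `license`, `ROR`; the titles and `ror-A`/`ror-B` values are placeholders.

```python
import pandas as pd

toy = pd.DataFrame({
    "issn":    ["1111-1111", "1111-1111", None, "2222-2222"],
    "license": ["cc_by", "cc_by", "cc_by", "cc_by_nc"],
    "ROR":     ["ror-A", "ror-A", "ror-B", "ror-A"],
    "Title":   ["Journal A", "Journal A (dup)", "No ISSN", "Journal B"],
})

cleaned = (
    toy.dropna(subset=["issn"])  # rows without an ISSN are removed
       .drop_duplicates(subset=["issn", "license", "ROR"])  # first occurrence is kept
)

print(cleaned["Title"].tolist())
```

Only the columns named in `subset` decide what counts as a duplicate, so two rows that differ in `Title` alone are still collapsed into one.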
-
-
-
-
-
-```python
-# reindex and add the id as index + 1
-rp_fin = rp_fin.reset_index()
-del rp_fin['index']
-rp_fin = rp_fin.reset_index()
-rp_fin['rp_id'] = rp_fin.index + 1
-rp_fin
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/04d8ztx87 | 1 |
-| 1 | 1 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02bnkt322 | 2 |
-| 2 | 2 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/00zg4za48 | 3 |
-| 3 | 3 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02s376052 | 4 |
-| 4 | 4 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/05a28rw58 | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 751623 | 751623 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/01swzsf04 | 751624 |
-| 751624 | 751624 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/019whta54 | 751625 |
-| 751625 | 751625 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/00vasag41 | 751626 |
-| 751626 | 751626 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05r0ap620 | 751627 |
-| 751627 | 751627 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05pmsvm27 | 751628 |
-
-
-
-
751628 rows × 18 columns
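
The extra `index` column in this output comes from calling `reset_index()` without `drop=True`. A possible alternative, shown here on invented data rather than on `rp_fin`, renumbers the rows and assigns a 1-based id without leaving that column behind:

```python
import pandas as pd

# Toy frame whose index has gaps, as after drop_duplicates().
df = pd.DataFrame(
    {"issn": ["1111-1111", "2222-2222", "3333-3333"]},
    index=[0, 4, 9],
)

df = df.reset_index(drop=True)  # new 0..n-1 index, old labels discarded
df["rp_id"] = df.index + 1      # 1-based identifier

print(df)
```

Later cells delete the `index` column explicitly, so they would need adjusting if this variant were adopted.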
-
-
-
-
-
-```python
-rp_fin['embargo_months'].value_counts()
-```
-
-
-
-
- 0 683364
- 60 68264
- Name: embargo_months, dtype: int64
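
By default `value_counts()` leaves missing values out of the tally, so the `isna()` check in the next cell is a useful complement. A small sketch on invented values showing that `dropna=False` reports them in the same table:

```python
import pandas as pd

embargo = pd.Series([0, 0, 60, None], name="embargo_months")

counts_default = embargo.value_counts()                # NaN is ignored
counts_with_nan = embargo.value_counts(dropna=False)   # NaN gets its own row

print(counts_with_nan)
```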
-
-
-
-
-```python
-# check for rows without an embargo value
-rp_fin.loc[rp_fin['embargo_months'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-issn
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | journal | issn_type |
-|---|---|---|---|---|
-| 0 | 1 | 0001-2815 | 532 | 1 |
-| 1 | 2 | 1399-0039 | 532 | 2 |
-| 2 | 3 | 0001-4842 | 498 | 1 |
-| 3 | 4 | 1520-4898 | 498 | 2 |
-| 4 | 5 | 0001-4966 | 789 | 1 |
-| ... | ... | ... | ... | ... |
-| 1755 | 1756 | 2470-0045 | 533 | 3 |
-| 1756 | 1757 | 2470-0053 | 533 | 2 |
-| 1757 | 1758 | 2475-9953 | 608 | 2 |
-| 1758 | 1759 | 2504-4427 | 994 | 1 |
-| 1759 | 1760 | 2504-4435 | 994 | 3 |
-
-
-
-
1760 rows × 4 columns
-
-
-
-
-
-```python
-# merge to get the ISSN-L
-issn = pd.merge(issn, issnl, on='issn', how='left')
-issn
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | journal | issn_type | issnl |
-|---|---|---|---|---|---|
-| 0 | 1 | 0001-2815 | 532 | 1 | 0001-2815 |
-| 1 | 2 | 1399-0039 | 532 | 2 | 0001-2815 |
-| 2 | 3 | 0001-4842 | 498 | 1 | 0001-4842 |
-| 3 | 4 | 1520-4898 | 498 | 2 | 0001-4842 |
-| 4 | 5 | 0001-4966 | 789 | 1 | 0001-4966 |
-| ... | ... | ... | ... | ... | ... |
-| 1755 | 1756 | 2470-0045 | 533 | 3 | 2470-0045 |
-| 1756 | 1757 | 2470-0053 | 533 | 2 | 2470-0045 |
-| 1757 | 1758 | 2475-9953 | 608 | 2 | 2475-9953 |
-| 1758 | 1759 | 2504-4427 | 994 | 1 | 2504-4427 |
-| 1759 | 1760 | 2504-4435 | 994 | 3 | 2504-4427 |
-
-
-
-
1760 rows × 5 columns
-
-
-
-
-
-```python
-issn.loc[issn['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn | journal | issn_type | issnl |
-|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# merge the other way round to keep only the rows from the file
-rp_fin = pd.merge(rp_fin, issn[['id', 'journal', 'issnl']], on='issnl', how='left')
-rp_fin
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id | id | journal |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 0 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/04d8ztx87 | 1 | NaN | NaN |
-| 1 | 1 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02bnkt322 | 2 | NaN | NaN |
-| 2 | 2 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/00zg4za48 | 3 | NaN | NaN |
-| 3 | 3 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/02s376052 | 4 | NaN | NaN |
-| 4 | 4 | NaN | x | 1876-2859 | NaN | NaN | Academic Pediatrics | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1876-2859 | https://ror.org/05a28rw58 | 5 | NaN | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 792211 | 751623 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/01swzsf04 | 751624 | NaN | NaN |
-| 792212 | 751624 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/019whta54 | 751625 | NaN | NaN |
-| 792213 | 751625 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/00vasag41 | 751626 | NaN | NaN |
-| 792214 | 751626 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05r0ap620 | 751627 | NaN | NaN |
-| 792215 | 751627 | x | NaN | 1469-8730 | NaN | NaN | Zygote | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 0967-1994 | https://ror.org/05pmsvm27 | 751628 | NaN | NaN |
- NaN
- NaN
-
-
-
-
792216 rows × 20 columns
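
The merge grows the table from 751628 to 792216 rows because `issn` holds several rows per `issnl` (for example print and electronic ISSNs of the same journal), so each matching row is repeated once per ISSN. A hedged sketch on invented data: reducing the lookup to one row per `issnl` first, and passing `validate='many_to_one'`, makes pandas raise an error if duplicate keys remain on the right side. Whether the per-ISSN `id` is needed downstream is not clear from the notebook, so this is only an option; the `ror-A`/`ror-B` values and the ISSN `9999-9999` are placeholders.

```python
import pandas as pd

agreements = pd.DataFrame({
    "issnl": ["0001-2815", "0001-2815", "9999-9999"],
    "ROR":   ["ror-A", "ror-B", "ror-A"],
})
issn = pd.DataFrame({
    "issn":    ["0001-2815", "1399-0039"],  # two ISSNs of one journal
    "issnl":   ["0001-2815", "0001-2815"],
    "journal": [532, 532],
})

# Naive merge: every matching agreement row is repeated per ISSN.
naive = pd.merge(agreements, issn[["journal", "issnl"]], on="issnl", how="left")

# One row per ISSN-L in the lookup keeps the row count stable.
lookup = issn[["journal", "issnl"]].drop_duplicates(subset="issnl")
safe = pd.merge(agreements, lookup, on="issnl", how="left", validate="many_to_one")

print(len(agreements), len(naive), len(safe))
```

With the deduplicated lookup, the later `drop_duplicates(subset=['rp_id'])` step would have nothing left to remove.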
-
-
-
-
-
-```python
-# check for matched rows without an embargo value
-rp_fin.loc[rp_fin['embargo_months'].isna() & rp_fin['id'].notna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id | id | journal |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# keep only the rows that found a match in the merge
-rp_fin_merge = rp_fin.loc[rp_fin['id'].notna()]
-rp_fin_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id | id | journal |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 176 | 176 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 177 | 1623.0 | 899.0 |
-| 177 | 176 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 177 | 1624.0 | 899.0 |
-| 178 | 177 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 178 | 1623.0 | 899.0 |
-| 179 | 177 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 178 | 1624.0 | 899.0 |
-| 180 | 178 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 179 | 1623.0 | 899.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 788071 | 747485 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 747486 | 1419.0 | 592.0 |
-| 788072 | 747486 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 747487 | 1418.0 | 592.0 |
-| 788073 | 747486 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 747487 | 1419.0 | 592.0 |
-| 788074 | 747487 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 747488 | 1418.0 | 592.0 |
-| 788075 | 747487 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 747488 | 1419.0 | 592.0 |
-
-
-
-
80671 rows × 20 columns
-
-
-
-
-
-```python
-# drop the duplicates created by the merge (keep one row per rp_id)
-rp_fin_merge = rp_fin_merge.drop_duplicates(subset=['rp_id'])
-rp_fin_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id | id | journal |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 176 | 176 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 177 | 1623.0 | 899.0 |
-| 178 | 177 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 178 | 1623.0 | 899.0 |
-| 180 | 178 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 179 | 1623.0 | 899.0 |
-| 182 | 179 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 180 | 1623.0 | 899.0 |
-| 184 | 180 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 181 | 1623.0 | 899.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 788066 | 747483 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 747484 | 1418.0 | 592.0 |
-| 788068 | 747484 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 747485 | 1418.0 | 592.0 |
-| 788070 | 747485 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 747486 | 1418.0 | 592.0 |
-| 788072 | 747486 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 747487 | 1418.0 | 592.0 |
-| 788074 | 747487 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 747488 | 1418.0 | 592.0 |
-
-
-
-
40083 rows × 20 columns
-
-
-
-
-
-```python
-# check for rows without a journal
-rp_fin_merge.loc[rp_fin_merge['journal'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | index | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | rp_id | id | journal |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# turn the index into an id
-del rp_fin_merge['id']
-del rp_fin_merge['index']
-del rp_fin_merge['rp_id']
-rp_fin_merge = rp_fin_merge.reset_index()
-# add the id as index + 1 (the index still carries the pre-filter labels here)
-rp_fin_merge['rp_id'] = rp_fin_merge['index'] + 1
-del rp_fin_merge['index']
-rp_fin_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | journal | rp_id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 177 |
-| 1 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 179 |
-| 2 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 181 |
-| 3 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 183 |
-| 4 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 185 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 788067 |
-| 40079 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 788069 |
-| 40080 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 788071 |
-| 40081 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 788073 |
-| 40082 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 788075 |
-
-
-
-
40083 rows × 18 columns
-
-
-
-
-
-```python
-# turn the index into an id
-del rp_fin_merge['rp_id']
-rp_fin_merge = rp_fin_merge.reset_index()
-# add the id as index + 1
-rp_fin_merge['rp_id'] = rp_fin_merge['index'] + 1
-del rp_fin_merge['index']
-rp_fin_merge
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | journal | rp_id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 1 |
-| 1 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 2 |
-| 2 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 3 |
-| 3 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 4 |
-| 4 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 40079 |
-| 40079 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 40080 |
-| 40080 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 40081 |
-| 40081 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 40082 |
-| 40082 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 40083 |
-
-40083 rows × 18 columns
-
-
-
-
-
-```python
-rp_fin_merge['embargo_months'].value_counts()
-```
-
-
-
-
- 0 39163
- 60 920
- Name: embargo_months, dtype: int64
-
-
-
-
-```python
-# check for rows without an embargo value
-rp_fin_merge.loc[rp_fin_merge['embargo_months'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | journal | rp_id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# export to Excel
-rp_fin_merge.to_excel('sample/read_publish_brut_merge.xlsx', index=False)
-```
-
-
-```python
-# export to tab-separated values (TSV)
-rp_fin_merge.to_csv('sample/read_publish_brut_merge.tsv', sep='\t', index=False)
-```
diff --git a/import_scripts/09_oacct_read_and_publish.py b/import_scripts/09_oacct_read_and_publish.py
deleted file mode 100644
index 98ff4da0..00000000
--- a/import_scripts/09_oacct_read_and_publish.py
+++ /dev/null
@@ -1,607 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook transforms the data extracted from the various sources and exports them to the tables of the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Last updated: 08.09.2021
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# display all columns
-pd.set_option('display.max_columns', None)
-# set the starting value for ids
-id_start = 1
-
-
-# ## Adding discounts for journals covered by the Read & Publish licences
-#
-# Journals list by publisher :
-# * https://consortium.ch/elsevier_titlelist_publication
-# * https://consortium.ch/springer_titlelist_publication
-# * https://consortium.ch/wiley_titlelist_publish
-# * https://consortium.ch/tandf_titlelist_publish
-# * https://consortium.ch/sage_titlelist_publish
-# * https://consortium.ch/cup_titlelist_publish
-#
-# Licence term :
-# * Elsevier : 2020-2023
-# * Springer Nature : 2020-2022
-# * Wiley : 2021-2024
-# * Taylor & Francis : 2021-2023
-# * Cambridge University Press (CUP) : 2021-2023
-#
-# CC licences :
-# * Elsevier : CC-BY, CC-BY-NC-ND
-# * Springer Nature : CC-BY, CC-BY-NC
-# * Wiley : CC-BY, CC-BY-NC, CC-BY-NC-ND
-# * Taylor & Francis : CC-BY
-# * Cambridge University Press (CUP) : CC-BY, CC-BY-NC, CC-BY-NC-ND, CC-BY-NC-SA
-#
-# Special conditions :
-# * Cambridge University Press (CUP) : Only the following article types are covered: Research Articles, Review Articles, Rapid Communication, Brief Reports and Case Reports
-#
-#
-
-# ## Importing the ISSN file
-
-# In[2]:
-
-
-issn = pd.read_csv('sample/issn.tsv', encoding='utf-8', header=0, sep='\t')
-issn
-
-
-# In[3]:
-
-
-# open publishers
-publisher = pd.read_csv('sample/publisher.tsv', encoding='utf-8', header=0, sep='\t')
-publisher
-
-
-# In[4]:
-
-
-publisher.loc[publisher['name'] == 'Elsevier']
-
-
-# In[5]:
-
-
-publisher.loc[(publisher['name'] == 'Springer Verlag') | (publisher['name'] == 'Nature Research')]
-
-
-# In[6]:
-
-
-publisher.loc[publisher['name'] == 'Wiley']
-
-
-# In[7]:
-
-
-publisher.loc[publisher['name'] == 'Taylor and Francis']
-
-
-# In[8]:
-
-
-publisher.loc[publisher['name'] == 'Cambridge University Press']
-
-
-# In[9]:
-
-
-# open the list of organisations
-participants = pd.read_csv('agreements/consortium_institutions_participation_read_and_publish.csv', encoding='utf-8', header=0, sep='\t')
-participants
-
-
-# In[10]:
-
-
-# remove Lib4RI, which is a library
-participants = participants.loc[participants['Institution'] != 'Lib4RI']
-participants
-
-
-# In[11]:
-
-
-# add TF and CUP for all institutions (TODO: obtain the list of libraries for these two licences)
-participants['TF'] = 'x'
-participants['CUP'] = 'x'
-participants
-
-
-# In[12]:
-
-
-# open the Elsevier journal list
-elsevier = pd.read_excel('agreements/Elsevier_titlelist_publication.xlsx', skiprows=7)
-elsevier
-
-
-# In[13]:
-
-
-# add the article version field
-elsevier['article_version'] = 'published'
-elsevier
-
-
-# In[14]:
-
-
-# add the validity dates
-elsevier['valid_from'] = '2020-01-01'
-elsevier['valid_until'] = '2023-12-31'
-elsevier
-
-
-# In[15]:
-
-
-# add the embargo and archiving fields
-elsevier['embargo_months'] = 0
-elsevier['archiving'] = True
-elsevier
-
-
-# In[16]:
-
-
-elsevier.iloc[elsevier.shape[0]-1]
-
-
-# In[17]:
-
-
-# add the license field
-# cc_by, cc_by_nc_nd
-rp = pd.DataFrame()
-elsevier['article_version'] = 'published'
-elsevier['license'] = 'cc_by'
-elsevier['Elsevier'] = 'x'
-rp = rp.append(elsevier, ignore_index=True)
-elsevier['license'] = 'cc_by_nc_nd'
-rp = rp.append(elsevier, ignore_index=True)
-rp
-
-
-# In[18]:
-
-
-# open the Springer Nature journal list
-springer = pd.read_excel('agreements/Springer_titlelist_publication.xlsx', skiprows=7)
-springer
-
-
-# In[19]:
-
-
-# add the license field
-# cc_by, cc_by_nc
-springer['article_version'] = 'published'
-springer['license'] = 'cc_by'
-springer['Springer Nature'] = 'x'
-# add the validity dates
-springer['valid_from'] = '2020-01-01'
-springer['valid_until'] = '2022-12-31'
-# add the embargo and archiving fields
-springer['embargo_months'] = 0
-springer['archiving'] = True
-
-
-# In[20]:
-
-
-# append
-rp = rp.append(springer, ignore_index=True)
-springer['license'] = 'cc_by_nc'
-rp = rp.append(springer, ignore_index=True)
-rp
-
-
-# In[21]:
-
-
-# open the Wiley journal list
-wiley = pd.read_excel('agreements/Wiley_titlelist_publish.xlsx', skiprows=7)
-wiley
-
-
-# In[22]:
-
-
-# add the license field
-# cc_by, cc_by_nc, cc_by_nc_nd
-wiley['article_version'] = 'published'
-wiley['license'] = 'cc_by'
-wiley['Wiley'] = 'x'
-# add the validity dates
-wiley['valid_from'] = '2021-01-01'
-wiley['valid_until'] = '2024-12-31'
-# add the embargo and archiving fields
-wiley['embargo_months'] = 0
-wiley['archiving'] = True
-rp = rp.append(wiley, ignore_index=True)
-# append again with another licence
-wiley['license'] = 'cc_by_nc'
-rp = rp.append(wiley, ignore_index=True)
-# append again with another licence
-wiley['license'] = 'cc_by_nc_nd'
-rp = rp.append(wiley, ignore_index=True)
-rp
-
-
-# In[23]:
-
-
-# open the Taylor & Francis (TF) journal list
-tf = pd.read_excel('agreements/TandF_titlelist_publish.xlsx', skiprows=7)
-tf
-
-
-# In[24]:
-
-
-# add the license field
-# cc_by
-tf['article_version'] = 'published'
-tf['license'] = 'cc_by'
-tf['TF'] = 'x'
-# add the validity dates
-tf['valid_from'] = '2021-01-01'
-tf['valid_until'] = '2023-12-31'
-# add the embargo and archiving fields
-tf['embargo_months'] = 0
-tf['archiving'] = True
-
-
-# In[25]:
-
-
-# append
-rp = rp.append(tf, ignore_index=True)
-rp
-
-
-# In[26]:
-
-
-# open the Cambridge University Press (CUP) journal list
-cup = pd.read_excel('agreements/CUP_Journals_titlelist_publish.xlsx', skiprows=7)
-cup
-
-
-# In[27]:
-
-
-# rename the e-ISSN column to ISSN
-cup = cup.rename(columns = {'e-ISSN' : 'ISSN'})
-cup
-
-
-# In[28]:
-
-
-# add the license field
-# cc_by, cc_by_nc, cc_by_nc_nd, cc_by_nc_sa
-cup['article_version'] = 'published'
-cup['license'] = 'cc_by'
-cup['CUP'] = 'x'
-# add the validity dates
-cup['valid_from'] = '2021-01-01'
-cup['valid_until'] = '2023-12-31'
-# add the embargo and archiving fields
-cup['embargo_months'] = 60
-cup['archiving'] = True
-
-
-# In[29]:
-
-
-# append
-rp = rp.append(cup, ignore_index=True)
-cup['license'] = 'cc_by_nc'
-rp = rp.append(cup, ignore_index=True)
-cup['license'] = 'cc_by_nc_nd'
-rp = rp.append(cup, ignore_index=True)
-cup['license'] = 'cc_by_nc_sa'
-rp = rp.append(cup, ignore_index=True)
-rp
-
-
-# In[30]:
-
-
-# check for rows without an embargo value
-rp.loc[rp['embargo_months'].isna()]
-
-
-# In[31]:
-
-
-# add the ISSN-L values
-issnl = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-issnl
-
-
-# In[32]:
-
-
-# rename the columns
-issnl = issnl.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-rp = rp.rename(columns={'ISSN' : 'issn'})
-
-
-# In[33]:
-
-
-# merge
-rp = pd.merge(rp, issnl, on='issn', how='left')
-rp
-
-
-# In[34]:
-
-
-# combine the ISSNs for the merge
-# rp_1 = rp.loc[rp['issnl'].notna()][['issnl', 'article_version', 'license', 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP']]
-# rp_1 = rp_1.rename(columns = {'issnl' : 'issn'})
-# rp_2 = rp.loc[rp['issn'].notna()][['issn', 'article_version', 'license', 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP']]
-# rp_all = rp_1.append(rp_2, ignore_index=True)
-rp_all = rp
-
-
-# In[35]:
-
-
-# add the missing fields
-# discount value (id 2) set to 100% for Read & Publish licences
-# elsevier['amount'] = 100
-# elsevier['symbol'] = '%'
-# elsevier['cost_factor_type'] = 2
-# elsevier['comment'] = 'Source: swissuniversities'
-# elsevier
-
-
-# In[36]:
-
-
-# merge with the organisations
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_elsevier = participants.loc[participants['Elsevier'].notna()][['Elsevier', 'ROR']]
-rp_elsevier = rp_all.loc[rp_all['Elsevier'].notna()]
-rp_1 = pd.merge(rp_elsevier, participants_elsevier, on='Elsevier', how='outer')
-rp_1
-
-
-# In[37]:
-
-
-rp_elsevier
-
-
-# In[38]:
-
-
-participants_elsevier
-
-
-# In[39]:
-
-
-# merge with the organisations
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_springer = participants.loc[participants['Springer Nature'].notna()][['Springer Nature', 'ROR']]
-rp_springer = rp_all.loc[rp_all['Springer Nature'].notna()]
-rp_2 = pd.merge(rp_springer, participants_springer, on='Springer Nature', how='outer')
-rp_2
-
-
-# In[40]:
-
-
-# merge with the organisations
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_wiley = participants.loc[participants['Wiley'].notna()][['Wiley', 'ROR']]
-rp_wiley = rp_all.loc[rp_all['Wiley'].notna()]
-rp_3 = pd.merge(rp_wiley, participants_wiley, on='Wiley', how='outer')
-rp_3
-
-
-# In[41]:
-
-
-rp_wiley
-
-
-# In[42]:
-
-
-# merge with the organisations
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_tf = participants.loc[participants['TF'].notna()][['TF', 'ROR']]
-rp_tf = rp_all.loc[rp_all['TF'].notna()]
-rp_4 = pd.merge(rp_tf, participants_tf, on='TF', how='outer')
-rp_4
-
-
-# In[43]:
-
-
-# merge with the organisations
-# 'Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP'
-participants_cup = participants.loc[participants['CUP'].notna()][['CUP', 'ROR']]
-rp_cup = rp_all.loc[rp_all['CUP'].notna()]
-rp_5 = pd.merge(rp_cup, participants_cup, on='CUP', how='outer')
-rp_5
-
-
-# In[44]:
-
-
-# concatenate the 5 DataFrames
-rp_fin = rp_1.append(rp_2, ignore_index=True)
-rp_fin = rp_fin.append(rp_3, ignore_index=True)
-rp_fin = rp_fin.append(rp_4, ignore_index=True)
-rp_fin = rp_fin.append(rp_5, ignore_index=True)
-rp_fin
-
-
-# In[45]:
-
-
-# drop duplicates and empty rows
-rp_fin = rp_fin.dropna(subset=['issn'])
-rp_fin = rp_fin.drop_duplicates(subset=['issn', 'license', 'ROR'])
-rp_fin
-
-
-# In[46]:
-
-
-# reindex and add the id as index + 1
-rp_fin = rp_fin.reset_index()
-del rp_fin['index']
-rp_fin = rp_fin.reset_index()
-rp_fin['rp_id'] = rp_fin.index + 1
-rp_fin
-
-
-# In[47]:
-
-
-rp_fin['embargo_months'].value_counts()
-
-
-# In[48]:
-
-
-# check for rows without an embargo value
-rp_fin.loc[rp_fin['embargo_months'].isna()]
-
-
-# In[49]:
-
-
-issn
-
-
-# In[50]:
-
-
-# merge to obtain the ISSN-L
-issn = pd.merge(issn, issnl, on='issn', how='left')
-issn
-
-
-# In[51]:
-
-
-issn.loc[issn['issnl'].isna()]
-
-
-# In[52]:
-
-
-# merge the other way round to keep only the rows from the file
-rp_fin = pd.merge(rp_fin, issn[['id', 'journal', 'issnl']], on='issnl', how='left')
-rp_fin
-
-
-# In[53]:
-
-
-# check for rows without an embargo value
-rp_fin.loc[rp_fin['embargo_months'].isna() & rp_fin['id'].notna()]
-
-
-# In[54]:
-
-
-# keep only the rows that matched in the merge
-rp_fin_merge = rp_fin.loc[rp_fin['id'].notna()]
-rp_fin_merge
-
-
-# In[55]:
-
-
-# drop duplicates and empty rows
-rp_fin_merge = rp_fin_merge.drop_duplicates(subset=['rp_id'])
-rp_fin_merge
-
-
-# In[56]:
-
-
-# check for rows without a journal
-rp_fin_merge.loc[rp_fin_merge['journal'].isna()]
-
-
-# In[57]:
-
-
-# convert the index into an id
-del rp_fin_merge['id']
-del rp_fin_merge['index']
-del rp_fin_merge['rp_id']
-rp_fin_merge = rp_fin_merge.reset_index()
-# add the id as index + 1
-rp_fin_merge['rp_id'] = rp_fin_merge['index'] + 1
-del rp_fin_merge['index']
-rp_fin_merge
-
-
-# In[58]:
-
-
-# convert the index into an id
-del rp_fin_merge['rp_id']
-rp_fin_merge = rp_fin_merge.reset_index()
-# add the id as index + 1
-rp_fin_merge['rp_id'] = rp_fin_merge['index'] + 1
-del rp_fin_merge['index']
-rp_fin_merge
-
-
-# In[59]:
-
-
-rp_fin_merge['embargo_months'].value_counts()
-
-
-# In[60]:
-
-
-# check for rows without an embargo value
-rp_fin_merge.loc[rp_fin_merge['embargo_months'].isna()]
-
-
-# In[61]:
-
-
-# export to Excel
-rp_fin_merge.to_excel('sample/read_publish_brut_merge.xlsx', index=False)
-
-
-# In[62]:
-
-
-# export to tab-separated values (TSV)
-rp_fin_merge.to_csv('sample/read_publish_brut_merge.tsv', sep='\t', index=False)
-
diff --git a/import_scripts/10_oacct_terms.md b/import_scripts/10_oacct_terms.md
deleted file mode 100644
index 9b95fd74..00000000
--- a/import_scripts/10_oacct_terms.md
+++ /dev/null
@@ -1,39541 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) project
-
-P5 project of the EPFL Library in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook transforms the data extracted from the various sources and exports them to the tables of the OACCT application.
-
-Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-Last updated: 08.09.2021
-
-
-```python
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# display all columns
-pd.set_option('display.max_columns', None)
-# set the starting value for ids
-id_start = 1
-```
-
-## Importing the file extracted from Sherpa
-
-
-```python
-sherpa = pd.read_csv('sample/sherpa_policies_brut.tsv', encoding='utf-8', header=0, sep='\t')
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes | id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | submitted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Must acknowledge acceptance for publication ; ... | NaN | 1 |
-| 1 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | accepted | NaN | 12 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Publisher source must be acknowledged with cit... | NaN | 2 |
-| 2 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; institutional_repository ; named... | Any Website ; Institutional Repository | PubMed Central ; Subject Repository ; Journal ... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 3 |
-| 3 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by_nc_nd | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; named_repository ; non_commercia... | Any Website ; Non-Commercial Institutional Rep... | PubMed Central ; Non-Commercial Subject Reposi... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 4 |
-| 4 | 498 | 0001-4842 | 7760 | https://v2.sherpa.ac.uk/id/publisher_policy/4 | no | no | submitted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | named_repository ; preprint_repository ; subje... | NaN | ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito... | ChemRxiv ; bioRxiv ; arXiv | NaN | NaN | NaN | False | Must not violate ACS ethical Guidelines ; Must... | NaN | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8590 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | submitted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8591 |
-| 8591 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | accepted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8592 |
-| 8592 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | published | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8593 |
-| 8593 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8594 |
-| 8594 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8595 |
-
-8595 rows × 29 columns
-
-
-
-
-
-```python
-# check the values of the article versions
-sherpa['article_version'].value_counts()
-```
-
-
-
-
- published 4688
- accepted 3251
- submitted 656
- Name: article_version, dtype: int64
-
-
-
-
-```python
-# check for rows with a missing ISSN
-sherpa.loc[sherpa['issn'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes | id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# add the ISSN-L values
-issns = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-issns
-```
-
-
-
-
-
-
-
-
-
-
-| | ISSN | ISSN-L |
-|---|---|---|
-| 0 | 0000-0019 | 0000-0019 |
-| 1 | 0000-0027 | 0000-0027 |
-| 2 | 0000-0043 | 0000-0043 |
-| 3 | 0000-0051 | 0000-0051 |
-| 4 | 0000-006X | 0000-006X |
-| ... | ... | ... |
-| 1995913 | 8756-9957 | 8756-9957 |
-| 1995914 | 8756-9965 | 8756-9965 |
-| 1995915 | 8756-9973 | 8756-9973 |
-| 1995916 | 8756-9981 | 8756-9981 |
-| 1995917 | 8756-999X | 8756-999X |
-
-1995918 rows × 2 columns
-
-
-
-
-
-```python
-# rename the columns
-issns = issns.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-issns
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl |
-|---|---|---|
-| 0 | 0000-0019 | 0000-0019 |
-| 1 | 0000-0027 | 0000-0027 |
-| 2 | 0000-0043 | 0000-0043 |
-| 3 | 0000-0051 | 0000-0051 |
-| 4 | 0000-006X | 0000-006X |
-| ... | ... | ... |
-| 1995913 | 8756-9957 | 8756-9957 |
-| 1995914 | 8756-9965 | 8756-9965 |
-| 1995915 | 8756-9973 | 8756-9973 |
-| 1995916 | 8756-9981 | 8756-9981 |
-| 1995917 | 8756-999X | 8756-999X |
-
-1995918 rows × 2 columns
-
-
-
-
-
-```python
-# merge with the sherpa table
-sherpa = pd.merge(sherpa, issns, on='issn', how='left')
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes | id | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | submitted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Must acknowledge acceptance for publication ; ... | NaN | 1 | 0001-2815 |
-| 1 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/2050 | no | no | accepted | NaN | 12 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; named_repository ; non_comm... | Non-Commercial Institutional Repository | Author's Homepage ; arXiv ; AgEcon ; PhilPaper... | arXiv ; AgEcon ; PhilPapers ; PubMed Central ;... | NaN | NaN | NaN | True | Publisher source must be acknowledged with cit... | NaN | 2 | 0001-2815 |
-| 2 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; institutional_repository ; named... | Any Website ; Institutional Repository | PubMed Central ; Subject Repository ; Journal ... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 3 | 0001-2815 |
-| 3 | 532 | 0001-2815 | 11905 | https://v2.sherpa.ac.uk/id/publisher_policy/3315 | no | yes | published | cc_by_nc_nd | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_website ; named_repository ; non_commercia... | Any Website ; Non-Commercial Institutional Rep... | PubMed Central ; Non-Commercial Subject Reposi... | PubMed Central | NaN | authors | disciplinary (PubMed Central) ; | True | Published source must be acknowledged | NaN | 4 | 0001-2815 |
-| 4 | 498 | 0001-4842 | 7760 | https://v2.sherpa.ac.uk/id/publisher_policy/4 | no | no | submitted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | named_repository ; preprint_repository ; subje... | NaN | ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito... | ChemRxiv ; bioRxiv ; arXiv | NaN | NaN | NaN | False | Must not violate ACS ethical Guidelines ; Must... | NaN | 5 | 0001-4842 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8590 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | submitted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8591 | 2475-9953 |
-| 8591 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | accepted | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8592 | 2475-9953 |
-| 8592 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | no | published | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | authors_homepage ; institutional_repository ; ... | Institutional Repository ; Institutional Website | Author's Homepage | NaN | NaN | NaN | NaN | True | Must link to published article ; Publisher cop... | NaN | 8593 | 2475-9953 |
-| 8593 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8594 | 2475-9953 |
-| 8594 | 608 | 2475-9953 | 33503 | https://v2.sherpa.ac.uk/id/publisher_policy/10 | no | yes | published | cc_by | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | any_repository ; this_journal | Any Repository | Journal Website | NaN | NaN | NaN | NaN | True | NaN | NaN | 8595 | 2475-9953 |
-
-8595 rows × 30 columns
-
-
-
-
-
-```python
-# check for rows with a missing ISSN-L
-sherpa.loc[sherpa['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | journal | issn | sherpa_id | sherpa_uri | open_access_prohibited | additional_oa_fee | article_version | license | embargo | prerequisites | prerequisite_funders | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id | prerequisite_subjects | location | locations_ir | locations_not_ir | named_repository | named_academic_social_network | copyright_owner | publisher_deposit | archiving | conditions | public_notes | id | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# extract the IR archiving and embargo data per ISSN
-sherpa_ir = sherpa[['issnl', ]]
-```
-
-## Importing the Read & Publish licences file
-
-
-```python
-rp = pd.read_csv('sample/read_publish_brut_merge.tsv', encoding='utf-8', header=0, sep='\t')
-rp
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py:3058: DtypeWarning: Columns (0,1,3,4) have mixed types. Specify dtype option on import or set low_memory=False.
- interactivity=interactivity, compiler=compiler, result=result)
-
-
-
-
-
-
-
-
-
-
-
-| | CUP | Elsevier | issn | Springer Nature | TF | Title | URL | Wiley | archiving | article_version | embargo_months | license | valid_from | valid_until | issnl | ROR | journal | rp_id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 1 |
-| 1 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 2 |
-| 2 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 3 |
-| 3 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 4 |
-| 4 | NaN | x | 1742-7061 | NaN | NaN | Acta Biomaterialia | NaN | NaN | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 40079 |
-| 40079 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 40080 |
-| 40080 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 40081 |
-| 40081 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 40082 |
-| 40082 | x | NaN | 1435-8115 | NaN | NaN | Microscopy and Microanalysis | http://www.cambridge.org/core/product/identifi... | NaN | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 40083 |
-
-40083 rows × 18 columns
-
-
-
-
-
-```python
-rp['embargo_months'].value_counts()
-```
-
-
-
-
- 0 39163
- 60 920
- Name: embargo_months, dtype: int64
-
-
-
-
-```python
-# consolidate the per-publisher flag columns into a single rp_publisher column
-# rp.loc[rp['Elsevier'] == 'x', 'public_notes'] = 'Elsevier Read & Publish agreement'
-rp.loc[rp['Elsevier'] == 'x', 'rp_publisher'] = 'Elsevier'
-rp.loc[rp['Springer Nature'] == 'x', 'rp_publisher'] = 'Springer Nature'
-rp.loc[rp['Wiley'] == 'x', 'rp_publisher'] = 'Wiley'
-rp.loc[rp['TF'] == 'x', 'rp_publisher'] = 'TF'
-rp.loc[rp['CUP'] == 'x', 'rp_publisher'] = 'CUP'
-rp
-```
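The assignments above assume that each row carries exactly one `x` flag; a row with two flags silently keeps the last publisher assigned, and a row with none stays empty. As a minimal, dependency-free sketch of the same collapse with that assumption made explicit (the helper name `collapse_publisher_flags` and the two sample rows are hypothetical, not part of the notebook):

```python
# Hypothetical helper: collapse one-hot 'x' flag columns into a single label.
PUBLISHER_COLUMNS = ['Elsevier', 'Springer Nature', 'Wiley', 'TF', 'CUP']


def collapse_publisher_flags(row, columns=PUBLISHER_COLUMNS):
    """Return the single flagged publisher, or raise if the row is ambiguous."""
    flagged = [name for name in columns if row.get(name) == 'x']
    if len(flagged) != 1:
        raise ValueError(f"expected exactly one publisher flag, got {flagged}")
    return flagged[0]


# Illustrative rows shaped like the rp table (only the relevant fields).
rows = [
    {'Title': 'Acta Biomaterialia', 'Elsevier': 'x', 'CUP': None},
    {'Title': 'Microscopy and Microanalysis', 'Elsevier': None, 'CUP': 'x'},
]
publishers = [collapse_publisher_flags(row) for row in rows]
print(publishers)
```

Raising on ambiguous rows is a design choice for this sketch; the notebook itself does not check for them.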
-
-
-
-
-
-
-
-
-
-
- CUP
- Elsevier
- issn
- Springer Nature
- TF
- Title
- URL
- Wiley
- archiving
- article_version
- embargo_months
- license
- valid_from
- valid_until
- issnl
- ROR
- journal
- rp_id
- rp_publisher
-
-
-
-
- 0
- NaN
- x
- 1742-7061
- NaN
- NaN
- Acta Biomaterialia
- NaN
- NaN
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/04d8ztx87
- 899.0
- 1
- Elsevier
-
-
- 1
- NaN
- x
- 1742-7061
- NaN
- NaN
- Acta Biomaterialia
- NaN
- NaN
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02bnkt322
- 899.0
- 2
- Elsevier
-
-
- 2
- NaN
- x
- 1742-7061
- NaN
- NaN
- Acta Biomaterialia
- NaN
- NaN
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/00zg4za48
- 899.0
- 3
- Elsevier
-
-
- 3
- NaN
- x
- 1742-7061
- NaN
- NaN
- Acta Biomaterialia
- NaN
- NaN
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02s376052
- 899.0
- 4
- Elsevier
-
-
- 4
- NaN
- x
- 1742-7061
- NaN
- NaN
- Acta Biomaterialia
- NaN
- NaN
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/05a28rw58
- 899.0
- 5
- Elsevier
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40078
- x
- NaN
- 1435-8115
- NaN
- NaN
- Microscopy and Microanalysis
- http://www.cambridge.org/core/product/identifi...
- NaN
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/01swzsf04
- 592.0
- 40079
- CUP
-
-
- 40079
- x
- NaN
- 1435-8115
- NaN
- NaN
- Microscopy and Microanalysis
- http://www.cambridge.org/core/product/identifi...
- NaN
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/019whta54
- 592.0
- 40080
- CUP
-
-
- 40080
- x
- NaN
- 1435-8115
- NaN
- NaN
- Microscopy and Microanalysis
- http://www.cambridge.org/core/product/identifi...
- NaN
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/00vasag41
- 592.0
- 40081
- CUP
-
-
- 40081
- x
- NaN
- 1435-8115
- NaN
- NaN
- Microscopy and Microanalysis
- http://www.cambridge.org/core/product/identifi...
- NaN
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05r0ap620
- 592.0
- 40082
- CUP
-
-
- 40082
- x
- NaN
- 1435-8115
- NaN
- NaN
- Microscopy and Microanalysis
- http://www.cambridge.org/core/product/identifi...
- NaN
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05pmsvm27
- 592.0
- 40083
- CUP
-
-
-
-
40083 rows × 19 columns
-
-
-
-
-
-```python
-# check the distribution of publisher values
-rp['rp_publisher'].value_counts()
-```
-
-
-
-
- Elsevier 18128
- Wiley 13905
- Springer Nature 6716
- CUP 920
- TF 414
- Name: rp_publisher, dtype: int64
-
-
-
-
-```python
-# check the distribution of licence values
-rp['license'].value_counts()
-```
-
-
-
-
- cc_by 17701
- cc_by_nc_nd 13929
- cc_by_nc 8223
- cc_by_nc_sa 230
- Name: license, dtype: int64
-
-
-
-
-```python
-# drop the columns that are no longer needed
-del rp['Elsevier']
-del rp['Springer Nature']
-del rp['Wiley']
-del rp['TF']
-del rp['CUP']
-del rp['URL']
-rp
-```
-
-
-
-
-
-
-
-
-
-
- issn
- Title
- archiving
- article_version
- embargo_months
- license
- valid_from
- valid_until
- issnl
- ROR
- journal
- rp_id
- rp_publisher
-
-
-
-
- 0
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/04d8ztx87
- 899.0
- 1
- Elsevier
-
-
- 1
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02bnkt322
- 899.0
- 2
- Elsevier
-
-
- 2
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/00zg4za48
- 899.0
- 3
- Elsevier
-
-
- 3
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02s376052
- 899.0
- 4
- Elsevier
-
-
- 4
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/05a28rw58
- 899.0
- 5
- Elsevier
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40078
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/01swzsf04
- 592.0
- 40079
- CUP
-
-
- 40079
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/019whta54
- 592.0
- 40080
- CUP
-
-
- 40080
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/00vasag41
- 592.0
- 40081
- CUP
-
-
- 40081
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05r0ap620
- 592.0
- 40082
- CUP
-
-
- 40082
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05pmsvm27
- 592.0
- 40083
- CUP
-
-
-
-
40083 rows × 13 columns
-
-
-
-
-
-```python
-# rename the columns (rp_id already has its final name)
-rp = rp.rename(columns = {'Title' : 'title', 'ROR' : 'ror'})
-rp
-```
-
-
-
-
-
-
-
-
-
-
- issn
- title
- archiving
- article_version
- embargo_months
- license
- valid_from
- valid_until
- issnl
- ror
- journal
- rp_id
- rp_publisher
-
-
-
-
- 0
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/04d8ztx87
- 899.0
- 1
- Elsevier
-
-
- 1
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02bnkt322
- 899.0
- 2
- Elsevier
-
-
- 2
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/00zg4za48
- 899.0
- 3
- Elsevier
-
-
- 3
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02s376052
- 899.0
- 4
- Elsevier
-
-
- 4
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/05a28rw58
- 899.0
- 5
- Elsevier
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40078
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/01swzsf04
- 592.0
- 40079
- CUP
-
-
- 40079
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/019whta54
- 592.0
- 40080
- CUP
-
-
- 40080
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/00vasag41
- 592.0
- 40081
- CUP
-
-
- 40081
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05r0ap620
- 592.0
- 40082
- CUP
-
-
- 40082
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05pmsvm27
- 592.0
- 40083
- CUP
-
-
-
-
40083 rows × 13 columns
-
-
-
-
-## Table applicable_version
-
-
-```python
-# build the lookup table of article versions
-# 3 values: submitted, accepted, published
-applicable_version = pd.DataFrame(
-    [
-        {'id': 1, 'type': 'submitted', 'description': 'Submitted version'},
-        {'id': 2, 'type': 'accepted', 'description': 'Accepted version'},
-        {'id': 3, 'type': 'published', 'description': 'Published version'},
-    ],
-    columns=['id', 'type', 'description'],
-)
-applicable_version
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | type | description |
-|---|----|------|-------------|
-| 0 | 1 | submitted | Submitted version |
-| 1 | 2 | accepted | Accepted version |
-| 2 | 3 | published | Published version |
-
-
-
-
-
-
-
-
-```python
-# add the UNKNOWN fallback value
-unknown_row = pd.DataFrame([{'id' : 999999, 'type' : 'UNKNOWN', 'description' : 'UNKNOWN'}])
-applicable_version = pd.concat([applicable_version, unknown_row], ignore_index=True)
-applicable_version
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | type | description |
-|---|----|------|-------------|
-| 0 | 1 | submitted | Submitted version |
-| 1 | 2 | accepted | Accepted version |
-| 2 | 3 | published | Published version |
-| 3 | 999999 | UNKNOWN | UNKNOWN |
-
-
-
-
-
-
-
-
-```python
-# select the columns to export
-applicable_version_export = applicable_version[['id', 'description']]
-```
-
-
-```python
-# export the applicable_version table as JSON
-result = applicable_version_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/version.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
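The export above writes into `sample/`, and `open` fails if that folder does not exist yet. A minimal sketch of the same record-oriented export using only the standard library, creating the folder first (the helper name `export_records` is hypothetical, and a temporary directory stands in for the real output location):

```python
import json
import tempfile
from pathlib import Path

# Records shaped like applicable_version_export (id and description only).
records = [
    {'id': 1, 'description': 'Submitted version'},
    {'id': 2, 'description': 'Accepted version'},
    {'id': 3, 'description': 'Published version'},
    {'id': 999999, 'description': 'UNKNOWN'},
]


def export_records(records, path):
    """Write records as pretty-printed UTF-8 JSON, creating parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open('w', encoding='utf-8') as file:
        json.dump(records, file, indent=2, ensure_ascii=False)


# Write into a throwaway directory and read the file back to check it.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'sample' / 'version.json'
    export_records(records, target)
    reloaded = json.loads(target.read_text(encoding='utf-8'))

print(reloaded == records)
```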
-
-
-```python
-# export as TSV (tab-separated)
-applicable_version_export.to_csv('sample/version.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-applicable_version_export.to_excel('sample/version.xlsx', index=False)
-```
-
-
-```python
-# left-join the version ids onto the sherpa table
-sherpa = pd.merge(sherpa, applicable_version[['id', 'type']], left_on='article_version', right_on='type', how='left')
-sherpa
-```
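With `how='left'`, any row whose `article_version` has no match in `applicable_version` keeps its data but gets an empty `id`. The sketch below shows, in plain Python, one way such rows could be pointed at the `UNKNOWN` entry (id 999999) instead. This fallback step is an assumption: the notebook shown here does not apply it, and the last sample row is made up for illustration.

```python
UNKNOWN_ID = 999999  # same sentinel id as the UNKNOWN row of applicable_version

# type -> id, as defined in the applicable_version table.
VERSION_IDS = {'submitted': 1, 'accepted': 2, 'published': 3}


def version_id(article_version):
    """Look up the version id, falling back to UNKNOWN_ID when unmatched."""
    return VERSION_IDS.get(article_version, UNKNOWN_ID)


# Illustrative policy rows; the last one is hypothetical and has no version.
policies = [
    {'issn': '0001-2815', 'article_version': 'submitted'},
    {'issn': '0001-2815', 'article_version': 'published'},
    {'issn': '0000-0000', 'article_version': None},
]
for policy in policies:
    policy['version'] = version_id(policy['article_version'])

print([policy['version'] for policy in policies])
```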
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- license
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id_x
- issnl
- id_y
- type
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
- submitted
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
- accepted
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- published
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- published
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
- submitted
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
- submitted
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
- accepted
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- published
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- published
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- published
-
-
-
-
8595 rows × 32 columns
-
-
-
-
-
-```python
-sherpa = sherpa.rename(columns = {'id_x' : 'id', 'id_y' : 'version'})
-del sherpa['type']
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- license
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
-
-
-
-
8595 rows × 31 columns
-
-
-
-
-
-```python
-# left-join the version ids onto the read & publish table
-rp = pd.merge(rp, applicable_version[['id', 'type']], left_on='article_version', right_on='type', how='left')
-rp
-```
-
-
-
-
-
-
-
-
-
-
- issn
- title
- archiving
- article_version
- embargo_months
- license
- valid_from
- valid_until
- issnl
- ror
- journal
- rp_id
- rp_publisher
- id
- type
-
-
-
-
- 0
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/04d8ztx87
- 899.0
- 1
- Elsevier
- 3
- published
-
-
- 1
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02bnkt322
- 899.0
- 2
- Elsevier
- 3
- published
-
-
- 2
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/00zg4za48
- 899.0
- 3
- Elsevier
- 3
- published
-
-
- 3
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02s376052
- 899.0
- 4
- Elsevier
- 3
- published
-
-
- 4
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/05a28rw58
- 899.0
- 5
- Elsevier
- 3
- published
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40078
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/01swzsf04
- 592.0
- 40079
- CUP
- 3
- published
-
-
- 40079
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/019whta54
- 592.0
- 40080
- CUP
- 3
- published
-
-
- 40080
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/00vasag41
- 592.0
- 40081
- CUP
- 3
- published
-
-
- 40081
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05r0ap620
- 592.0
- 40082
- CUP
- 3
- published
-
-
- 40082
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05pmsvm27
- 592.0
- 40083
- CUP
- 3
- published
-
-
-
-
40083 rows × 15 columns
-
-
-
-
-
-```python
-rp = rp.rename(columns = {'id' : 'version'})
-del rp['type']
-rp
-```
-
-
-
-
-
-
-
-
-
-
- issn
- title
- archiving
- article_version
- embargo_months
- license
- valid_from
- valid_until
- issnl
- ror
- journal
- rp_id
- rp_publisher
- version
-
-
-
-
- 0
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/04d8ztx87
- 899.0
- 1
- Elsevier
- 3
-
-
- 1
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02bnkt322
- 899.0
- 2
- Elsevier
- 3
-
-
- 2
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/00zg4za48
- 899.0
- 3
- Elsevier
- 3
-
-
- 3
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02s376052
- 899.0
- 4
- Elsevier
- 3
-
-
- 4
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/05a28rw58
- 899.0
- 5
- Elsevier
- 3
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40078
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/01swzsf04
- 592.0
- 40079
- CUP
- 3
-
-
- 40079
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/019whta54
- 592.0
- 40080
- CUP
- 3
-
-
- 40080
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/00vasag41
- 592.0
- 40081
- CUP
- 3
-
-
- 40081
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05r0ap620
- 592.0
- 40082
- CUP
- 3
-
-
- 40082
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05pmsvm27
- 592.0
- 40083
- CUP
- 3
-
-
-
-
40083 rows × 14 columns
-
-
-
-
-## Table oa_licence
-
-
-```python
-# create the DataFrame
-# the licence 'version' is not used: licences are deduplicated by name, ignoring the version
-col_names = ['id',
- 'name',
- 'url'
- ]
-oa_licence = pd.DataFrame(columns = col_names)
-oa_licence
-```
-
-
-
-
-
-
-
-
-
-
- id
- name
- url
-
-
-
-
-
-
-
-
-
-
-```python
-# list the licence values found in the sherpa table
-sherpa['license'].value_counts()
-```
-
-
-
-
- cc_by 4151
- cc_by_nc_nd 2338
- cc_by_nc 559
- bespoke_license 47
- cc_by_nc_sa 20
- cc_by_nd 7
- cc_by_sa 4
- cc0 3
- all_rights_reserved 1
- Name: license, dtype: int64
-
-
-
-
-```python
-sherpa_licences = sherpa['license'].drop_duplicates()
-sherpa_licences = sherpa_licences.dropna()
-sherpa_licences
-```
-
-
-
-
- 2 cc_by
- 3 cc_by_nc_nd
- 8 bespoke_license
- 29 cc_by_nc
- 425 cc_by_nc_sa
- 443 all_rights_reserved
- 2147 cc_by_sa
- 2148 cc_by_nd
- 8420 cc0
- Name: license, dtype: object
-
-
-
-
-```python
-oa_licence['sherpa_code'] = np.nan
-oa_licence
-```
-
-
-
-
-
-
-
-
-
-
- id
- name
- url
- sherpa_code
-
-
-
-
-
-
-
-
-
-
-```python
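-for code in sherpa_licences:
-    print(code)
-# one row per distinct code; id, name and url are filled in below
-oa_licence = pd.DataFrame({'sherpa_code': list(sherpa_licences)}).reindex(columns=oa_licence.columns).astype(object)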
-```
-
- cc_by
- cc_by_nc_nd
- bespoke_license
- cc_by_nc
- cc_by_nc_sa
- all_rights_reserved
- cc_by_sa
- cc_by_nd
- cc0
-
-
-
-```python
-oa_licence
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | name | url | sherpa_code |
-|---|----|------|-----|-------------|
-| 0 | NaN | NaN | NaN | cc_by |
-| 1 | NaN | NaN | NaN | cc_by_nc_nd |
-| 2 | NaN | NaN | NaN | bespoke_license |
-| 3 | NaN | NaN | NaN | cc_by_nc |
-| 4 | NaN | NaN | NaN | cc_by_nc_sa |
-| 5 | NaN | NaN | NaN | all_rights_reserved |
-| 6 | NaN | NaN | NaN | cc_by_sa |
-| 7 | NaN | NaN | NaN | cc_by_nd |
-| 8 | NaN | NaN | NaN | cc0 |
-
-
-
-
-
-
-
-
-```python
-# turn the index into a regular column
-oa_licence = oa_licence.reset_index()
-# set the id to index + 1
-oa_licence['id'] = oa_licence['index'] + 1
-del oa_licence['index']
-oa_licence
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | name | url | sherpa_code |
-|---|----|------|-----|-------------|
-| 0 | 1 | NaN | NaN | cc_by |
-| 1 | 2 | NaN | NaN | cc_by_nc_nd |
-| 2 | 3 | NaN | NaN | bespoke_license |
-| 3 | 4 | NaN | NaN | cc_by_nc |
-| 4 | 5 | NaN | NaN | cc_by_nc_sa |
-| 5 | 6 | NaN | NaN | all_rights_reserved |
-| 6 | 7 | NaN | NaN | cc_by_sa |
-| 7 | 8 | NaN | NaN | cc_by_nd |
-| 8 | 9 | NaN | NaN | cc0 |
-
-
-
-
-
-
-
-
-```python
-# add the display names and URLs
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by', 'name'] = 'CC BY'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by', 'url'] = 'https://creativecommons.org/licenses/by/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_sa', 'name'] = 'CC BY-SA'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_sa', 'url'] = 'https://creativecommons.org/licenses/by-sa/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc', 'name'] = 'CC BY-NC'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc', 'url'] = 'https://creativecommons.org/licenses/by-nc/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_sa', 'name'] = 'CC BY-NC-SA'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_sa', 'url'] = 'https://creativecommons.org/licenses/by-nc-sa/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nd', 'name'] = 'CC BY-ND'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nd', 'url'] = 'https://creativecommons.org/licenses/by-nd/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_nd', 'name'] = 'CC BY-NC-ND'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_nd', 'url'] = 'https://creativecommons.org/licenses/by-nc-nd/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc0', 'name'] = 'CC0'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc0', 'url'] = 'https://creativecommons.org/publicdomain/zero/1.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'bespoke_license', 'name'] = 'Specific license'
-oa_licence.loc[oa_licence['sherpa_code'] == 'bespoke_license', 'url'] = ''
-oa_licence.loc[oa_licence['sherpa_code'] == 'all_rights_reserved', 'name'] = 'All rights reserved'
-oa_licence.loc[oa_licence['sherpa_code'] == 'all_rights_reserved', 'url'] = ''
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_gnu_gpl', 'name'] = 'GNU GPL'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_gnu_gpl', 'url'] = 'http://gnugpl.org/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_public_domain', 'name'] = 'Public domain'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_public_domain', 'url'] = 'https://creativecommons.org/share-your-work/public-domain/'
-# oa_licence.loc[oa_licence['sherpa_code'] == 'bespoke_license', 'url'] = 'https://port.sas.ac.uk/mod/book/view.php?id=1340&chapterid=1003'
-oa_licence
-```
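The name and URL assignments above repeat the same pattern for every licence code. As a minimal sketch, the same information can be kept in a single mapping and looked up by code. The names and URLs below are copied from the assignments for the codes that actually occur in the data; the helper `describe_licence` and its fallback to `('UNKNOWN', '')` are assumptions made for illustration.

```python
# sherpa_code -> (display name, licence URL), copied from the assignments above.
LICENCES = {
    'cc_by': ('CC BY', 'https://creativecommons.org/licenses/by/4.0/'),
    'cc_by_sa': ('CC BY-SA', 'https://creativecommons.org/licenses/by-sa/4.0/'),
    'cc_by_nc': ('CC BY-NC', 'https://creativecommons.org/licenses/by-nc/4.0/'),
    'cc_by_nc_sa': ('CC BY-NC-SA', 'https://creativecommons.org/licenses/by-nc-sa/4.0/'),
    'cc_by_nd': ('CC BY-ND', 'https://creativecommons.org/licenses/by-nd/4.0/'),
    'cc_by_nc_nd': ('CC BY-NC-ND', 'https://creativecommons.org/licenses/by-nc-nd/4.0/'),
    'cc0': ('CC0', 'https://creativecommons.org/publicdomain/zero/1.0/'),
    'bespoke_license': ('Specific license', ''),
    'all_rights_reserved': ('All rights reserved', ''),
}


def describe_licence(sherpa_code):
    """Return (name, url) for a code; unknown codes map to ('UNKNOWN', '')."""
    return LICENCES.get(sherpa_code, ('UNKNOWN', ''))


# Build id/name/url rows in the order the codes appear in the notebook output.
codes = ['cc_by', 'cc_by_nc_nd', 'bespoke_license']
table = [
    {'id': position, 'sherpa_code': code,
     'name': describe_licence(code)[0], 'url': describe_licence(code)[1]}
    for position, code in enumerate(codes, start=1)
]
print(table[1]['name'])
```

Keeping the mapping in one place makes it easier to spot a code that has no name yet, since the lookup returns the fallback instead of leaving the field empty.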
-
-
-
-
-
-
-
-
-
-
-|   | id | name | url | sherpa_code |
-|---|----|------|-----|-------------|
-| 0 | 1 | CC BY | https://creativecommons.org/licenses/by/4.0/ | cc_by |
-| 1 | 2 | CC BY-NC-ND | https://creativecommons.org/licenses/by-nc-nd/... | cc_by_nc_nd |
-| 2 | 3 | Specific license |  | bespoke_license |
-| 3 | 4 | CC BY-NC | https://creativecommons.org/licenses/by-nc/4.0/ | cc_by_nc |
-| 4 | 5 | CC BY-NC-SA | https://creativecommons.org/licenses/by-nc-sa/... | cc_by_nc_sa |
-| 5 | 6 | All rights reserved |  | all_rights_reserved |
-| 6 | 7 | CC BY-SA | https://creativecommons.org/licenses/by-sa/4.0/ | cc_by_sa |
-| 7 | 8 | CC BY-ND | https://creativecommons.org/licenses/by-nd/4.0/ | cc_by_nd |
-| 8 | 9 | CC0 | https://creativecommons.org/publicdomain/zero/... | cc0 |
-
-
-
-
-
-
-
-
-```python
-# add the UNKNOWN fallback value
-unknown_row = pd.DataFrame([{'id' : 999999, 'sherpa_code' : '___', 'name' : 'UNKNOWN', 'url' : ''}])
-oa_licence = pd.concat([oa_licence, unknown_row], ignore_index=True)
-oa_licence
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | name | url | sherpa_code |
-|---|----|------|-----|-------------|
-| 0 | 1 | CC BY | https://creativecommons.org/licenses/by/4.0/ | cc_by |
-| 1 | 2 | CC BY-NC-ND | https://creativecommons.org/licenses/by-nc-nd/... | cc_by_nc_nd |
-| 2 | 3 | Specific license |  | bespoke_license |
-| 3 | 4 | CC BY-NC | https://creativecommons.org/licenses/by-nc/4.0/ | cc_by_nc |
-| 4 | 5 | CC BY-NC-SA | https://creativecommons.org/licenses/by-nc-sa/... | cc_by_nc_sa |
-| 5 | 6 | All rights reserved |  | all_rights_reserved |
-| 6 | 7 | CC BY-SA | https://creativecommons.org/licenses/by-sa/4.0/ | cc_by_sa |
-| 7 | 8 | CC BY-ND | https://creativecommons.org/licenses/by-nd/4.0/ | cc_by_nd |
-| 8 | 9 | CC0 | https://creativecommons.org/publicdomain/zero/... | cc0 |
-| 9 | 999999 | UNKNOWN |  | ___ |
-
-
-
-
-
-
-
-
-```python
-# add the licence ids to the sherpa and rp tables: first align the join column name
-sherpa = sherpa.rename(columns = {'license' : 'sherpa_code'})
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
-
-
-
-
8595 rows × 31 columns
-
-
-
-
-
-```python
-# add the licence key to the sherpa and rp tables: rename 'license' to 'sherpa_code' in rp
-rp = rp.rename(columns = {'license' : 'sherpa_code'})
-rp
-```
-
-
-
-
-
-
-
-
-
-
-|   | issn | title | archiving | article_version | embargo_months | sherpa_code | valid_from | valid_until | issnl | ror | journal | rp_id | rp_publisher | version |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 1 | Elsevier | 3 |
-| 1 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 2 | Elsevier | 3 |
-| 2 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 3 | Elsevier | 3 |
-| 3 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 4 | Elsevier | 3 |
-| 4 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 5 | Elsevier | 3 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 40079 | CUP | 3 |
-| 40079 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 40080 | CUP | 3 |
-| 40080 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 40081 | CUP | 3 |
-| 40081 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 40082 | CUP | 3 |
-| 40082 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 40083 | CUP | 3 |
-
-
-
-
40083 rows × 14 columns
-
-
-
-
-
-```python
-# merge
-sherpa = pd.merge(sherpa, oa_licence[['sherpa_code', 'id']], on='sherpa_code', how='left')
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id_x
- issnl
- version
- id_y
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
- NaN
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
- NaN
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- 1.0
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- 2.0
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
- NaN
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
- NaN
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- NaN
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- 1.0
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- 1.0
-
-
-
-
8595 rows × 32 columns
-
-
-
-
-
-```python
-sherpa = sherpa.rename(columns = {'id_x' : 'id', 'id_y' : 'licence'})
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
- NaN
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
- NaN
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- 1.0
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- 2.0
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
- NaN
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
- NaN
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- NaN
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- 1.0
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- 1.0
-
-
-
-
8595 rows × 32 columns
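
The merge above produces `id_x` and `id_y` because both frames carry an `id` column, and the resulting `licence` column is a float because unmatched rows are NaN. As a minimal sketch of one alternative, the lookup column can be renamed before merging and the result cast to the pandas nullable `Int64` dtype. The two small frames below are toy stand-ins with illustrative values, not the real `sherpa` and `oa_licence` tables.

```python
import pandas as pd

# Toy stand-ins for the real tables (illustrative values only).
oa_licence = pd.DataFrame({
    "id": [1, 2],
    "sherpa_code": ["cc_by", "cc_by_nc_nd"],
})
sherpa = pd.DataFrame({
    "id": [1, 2, 3],
    "sherpa_code": [None, "cc_by", "cc_by_nc_nd"],
})

# Rename the lookup id first so the merge cannot produce id_x / id_y.
lookup = oa_licence[["sherpa_code", "id"]].rename(columns={"id": "licence"})
merged = pd.merge(sherpa, lookup, on="sherpa_code", how="left")

# Unmatched rows stay missing; Int64 keeps the matched ids as integers.
merged["licence"] = merged["licence"].astype("Int64")
print(merged)
```

With this ordering the later `rename` of `id_x` and `id_y` is no longer needed, at the cost of one extra intermediate frame.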
-
-
-
-
-
-```python
-# merge
-rp = pd.merge(rp, oa_licence[['sherpa_code', 'id']], on='sherpa_code', how='left')
-rp
-```
-
-
-
-
-
-
-
-
-
-
-|   | issn | title | archiving | article_version | embargo_months | sherpa_code | valid_from | valid_until | issnl | ror | journal | rp_id | rp_publisher | version | id |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 1 | Elsevier | 3 | 1 |
-| 1 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 2 | Elsevier | 3 | 1 |
-| 2 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 3 | Elsevier | 3 | 1 |
-| 3 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 4 | Elsevier | 3 | 1 |
-| 4 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 5 | Elsevier | 3 | 1 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 40079 | CUP | 3 | 5 |
-| 40079 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 40080 | CUP | 3 | 5 |
-| 40080 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 40081 | CUP | 3 | 5 |
-| 40081 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 40082 | CUP | 3 | 5 |
-| 40082 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 40083 | CUP | 3 | 5 |
-
-
-
-
40083 rows × 15 columns
-
-
-
-
-
-```python
-rp = rp.rename(columns = {'id' : 'licence'})
-rp
-```
-
-
-
-
-
-
-
-
-
-
-|   | issn | title | archiving | article_version | embargo_months | sherpa_code | valid_from | valid_until | issnl | ror | journal | rp_id | rp_publisher | version | licence |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 1 | Elsevier | 3 | 1 |
-| 1 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 2 | Elsevier | 3 | 1 |
-| 2 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 3 | Elsevier | 3 | 1 |
-| 3 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 4 | Elsevier | 3 | 1 |
-| 4 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 5 | Elsevier | 3 | 1 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 40079 | CUP | 3 | 5 |
-| 40079 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 40080 | CUP | 3 | 5 |
-| 40080 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 40081 | CUP | 3 | 5 |
-| 40081 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 40082 | CUP | 3 | 5 |
-| 40082 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 40083 | CUP | 3 | 5 |
-
-
-
-
40083 rows × 15 columns
-
-
-
-
-
-```python
-# rename the final fields
-oa_licence_export = oa_licence[['id', 'name', 'url']]
-oa_licence_export = oa_licence_export.rename(columns={'name' : 'name_or_abbrev', 'url' : 'website'})
-```
-
-
-```python
-# export the oa_licence table
-result = oa_licence_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/licence.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export as TSV (tab-separated)
-oa_licence_export.to_csv('sample/licence.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-oa_licence_export.to_excel('sample/licence.xlsx', index=False)
-```
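
The export cells above repeat the same steps for each table. A minimal sketch of a reusable helper follows. It assumes a hypothetical function name `export_table` and a temporary output directory, neither of which appears in the notebook, and it covers only the JSON and TSV exports.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def export_table(df: pd.DataFrame, stem: str, out_dir: Path) -> None:
    """Write df as <stem>.json (a list of records) and <stem>.tsv."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Round-trip through to_json so pandas handles the type conversion.
    records = json.loads(df.to_json(orient="records", force_ascii=False))
    with open(out_dir / f"{stem}.json", "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2, ensure_ascii=False)
    df.to_csv(out_dir / f"{stem}.tsv", sep="\t", encoding="utf-8", index=False)


# Illustrative data only; the real table has more rows and a website column.
demo = pd.DataFrame({"id": [1, 999999], "name_or_abbrev": ["CC BY", "UNKNOWN"]})
out = Path(tempfile.mkdtemp())
export_table(demo, "licence", out)
exported = json.loads((out / "licence.json").read_text(encoding="utf-8"))
print(exported)
```

Keeping the export in one function means every table is written with the same encoding and separator settings.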
-
-## Table cost_factor_type
-
-
-```python
-# create the DataFrame (DataFrame.append was removed in pandas 2.0, so build it from a list of records)
-col_names = ['id', 'name']
-cost_factor_type = pd.DataFrame([
-    {'id' : 1, 'name' : 'APC'},
-    {'id' : 2, 'name' : 'Discount'},
-    {'id' : 3, 'name' : 'Refund'},
-], columns = col_names)
-cost_factor_type
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | name |
-|---|---|---|
-| 0 | 1 | APC |
-| 1 | 2 | Discount |
-| 2 | 3 | Refund |
-
-
-
-
-
-
-
-
-```python
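-# add the UNKNOWN value (DataFrame.append was removed in pandas 2.0, so use pd.concat)
-unknown_row = pd.DataFrame([{'id' : 999999, 'name' : 'UNKNOWN'}])
-cost_factor_type = pd.concat([cost_factor_type, unknown_row], ignore_index=True)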
-cost_factor_type
-```
-
-
-
-
-
-
-
-
-
-
-|   | id | name |
-|---|---|---|
-| 0 | 1 | APC |
-| 1 | 2 | Discount |
-| 2 | 3 | Refund |
-| 3 | 999999 | UNKNOWN |
-
-
-
-
-
-
-
-
-```python
-# export the cost_factor_type table
-result = cost_factor_type.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/cost_factor_type.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export as TSV (tab-separated)
-cost_factor_type.to_csv('sample/cost_factor_type.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
-```python
-# export excel
-cost_factor_type.to_excel('sample/cost_factor_type.xlsx', index=False)
-```
-
-## Table cost_factor
-
-### Adding APC data from DOAJ
-
-
-```python
-# load the DOAJ journal metadata
-doaj = pd.read_csv('doaj/journalcsv__doaj_20210312_0636_utf8.csv', encoding='utf-8', header=0)
-doaj
-```
-
-
-
-
-
-
-
-
-
-
- Journal title
- Journal URL
- URL in DOAJ
- Alternative title
- Journal ISSN (print version)
- Journal EISSN (online version)
- Keywords
- Languages in which the journal accepts manuscripts
- Publisher
- Country of publisher
- Society or institution
- Country of society or institution
- Journal license
- License attributes
- URL for license terms
- Machine-readable CC licensing information embedded or displayed in articles
- URL to an example page with embedded licensing information
- Author holds copyright without restrictions
- Copyright information URL
- Review process
- Review process information URL
- Journal plagiarism screening policy
- Plagiarism information URL
- URL for journal's aims & scope
- URL for the Editorial Board page
- URL for journal's instructions for authors
- Average number of weeks between article submission and publication
- APC
- APC information URL
- APC amount
- Journal waiver policy (for developing country authors etc)
- Waiver policy information URL
- Has other fees
- Other submission fees information URL
- Preservation Services
- Preservation Service: national library
- Preservation information URL
- Deposit policy directory
- URL for deposit policy
- Persistent article identifiers
- Article metadata includes ORCIDs
- Journal complies with I4OC standards for open citations
- Does this journal allow unrestricted reuse in compliance with BOAI?
- URL for journal's Open Access statement
- Continues
- Continued By
- LCC Codes
- Subjects
- DOAJ Seal
- Added on Date
- Last updated Date
- Number of Article Records
- Most Recent Article Added
-
-
-
-
- 0
- Anais da Academia Brasileira de Ciências
- http://www.scielo.br/scielo.php?script=sci_ser...
- https://doaj.org/toc/ed09859a464f4461b1af34279...
- Annals of the Brazilian Academy of Sciences
- 0001-3765
- 1678-2690
- biological sciences, exact and earth sciences,...
- English
- Academia Brasileira de Ciências
- Brazil
- NaN
- NaN
- CC BY
- NaN
- http://www.scielo.br/revistas/aabc/iaboutj.htm
- Yes
- http://www.scielo.br/scielo.php?script=sci_art...
- No
- NaN
- Peer review
- http://www.scielo.br/revistas/aabc/iinstruc.htm
- Yes
- http://www.scielo.br/revistas/aabc/iinstruc.htm
- http://www.scielo.br/revistas/aabc/iaboutj.htm
- http://www.scielo.br/revistas/aabc/iedboard.htm
- http://www.scielo.br/revistas/aabc/iinstruc.htm
- 18
- No
- http://www.scielo.br/revistas/aabc/iinstruc.htm
- NaN
- No
- NaN
- No
- http://www.scielo.br/revistas/aabc/iinstruc.htm
- NaN
- NaN
- NaN
- NaN
- NaN
- DOI
- NaN
- NaN
- Yes
- http://www.scielo.br/revistas/aabc/isubscrp.htm
- NaN
- NaN
- Q
- Science
- No
- 2004-04-23T21:31:00Z
- 2017-01-04T14:19:54Z
- 2649
- 2020-06-10T21:49:11Z
-
-
- 1
- ACME
- http://riviste.unimi.it/index.php/ACME
- https://doaj.org/toc/b1ca04ba56194f29a362b3eef...
- NaN
- 0001-494X
- 2282-0035
- italian literature, classic literature, lingui...
- Italian
- Università degli Studi di Milano
- Italy
- NaN
- NaN
- CC BY-NC-ND
- NaN
- http://riviste.unimi.it/index.php/ACME/index
- Yes
- http://riviste.unimi.it/index.php/ACME/article...
- Yes
- http://riviste.unimi.it/index.php/ACME/about/e...
- Blind peer review
- https://riviste.unimi.it/index.php/ACME/about
- No
- NaN
- https://riviste.unimi.it/index.php/ACME/about
- https://riviste.unimi.it/index.php/ACME/about/...
- http://riviste.unimi.it/index.php/ACME/about/s...
- 12
- No
- https://riviste.unimi.it/index.php/Lebenswelt/...
- NaN
- No
- NaN
- No
- https://riviste.unimi.it/index.php/Lebenswelt/...
- NaN
- Italian National Library (BNCF)
- http://www.depositolegale.it/
- NaN
- NaN
- DOI, NBN
- NaN
- NaN
- Yes
- http://riviste.unimi.it/index.php/ACME/about/e...
- NaN
- NaN
- A
- General Works
- No
- 2014-12-22T19:55:58Z
- 2020-02-24T09:07:42Z
- 166
- 2020-06-19T09:42:34Z
-
-
- 2
- Acta Dermato-Venereologica
- http://www.medicaljournals.se/acta
- https://doaj.org/toc/ffde9666ab1d46f1a8c688ce6...
- NaN
- 0001-5555
- 1651-2057
- sexually transmitted infections, psoriasis, ps...
- English
- Society for Publication of Acta Dermato-Venere...
- Sweden
- NaN
- NaN
- CC BY-NC
- NaN
- https://www.medicaljournals.se/acta/open-acces...
- NaN
- NaN
- No
- NaN
- Peer review
- https://www.medicaljournals.se/acta/instructio...
- No
- NaN
- http://www.medicaljournals.se/acta
- https://www.medicaljournals.se/acta/editors
- https://www.medicaljournals.se/acta/instructio...
- 20
- Yes
- https://www.medicaljournals.se/acta/instructio...
- 1600 EUR
- No
- NaN
- Yes
- https://www.medicaljournals.se/acta/instructio...
- NaN
- NaN
- http://www.ingentaconnect.com/publisher/claimi...
- Sherpa/Romeo
- NaN
- DOI
- NaN
- NaN
- Yes
- https://www.medicaljournals.se/acta/open-acces...
- NaN
- NaN
- RL1-803
- Medicine: Dermatology
- No
- 2011-11-10T12:31:05Z
- 2017-02-22T11:14:48Z
- 1096
- 2021-03-11T13:41:33Z
-
-
- 3
- Acta Médica Costarricense
- http://actamedica.medicos.cr/index.php/Acta_Me...
- https://doaj.org/toc/a5919aee5ad2413a89cf32df0...
- NaN
- 0001-6012
- 2215-5856
- medicine, public health, medical sciences, health
- English, Spanish
- Colegio de Médicos y Cirujanos de Costa Rica
- Costa Rica
- NaN
- NaN
- CC BY-NC-SA
- NaN
- http://actamedica.medicos.cr/index.php/Acta_Me...
- NaN
- NaN
- No
- http://actamedica.medicos.cr/index.php/Acta_Me...
- Double blind peer review
- http://actamedica.medicos.cr/index.php/Acta_Me...
- Yes
- http://actamedica.medicos.cr/index.php/Acta_Me...
- http://actamedica.medicos.cr/index.php/Acta_Me...
- http://actamedica.medicos.cr/index.php/Acta_Me...
- http://actamedica.medicos.cr/index.php/Acta_Me...
- 12
- No
- http://actamedica.medicos.cr/index.php/Acta_Me...
- NaN
- No
- NaN
- No
- NaN
- PKP PN
- NaN
- http://actamedica.medicos.cr/index.php/Acta_Me...
- Sherpa/Romeo
- http://actamedica.medicos.cr/index.php/Acta_Me...
- NaN
- No
- No
- Yes
- http://actamedica.medicos.cr/index.php/Acta_Me...
- NaN
- NaN
- R
- Medicine
- No
- 2020-12-22T11:08:24Z
- 2020-12-22T11:08:24Z
- 1207
- 2015-12-08T15:06:43Z
-
-
- 4
- Acta Mycologica
- https://pbsociety.org.pl/journals/index.php/am...
- https://doaj.org/toc/0e8e2531ae3f455ebb49acb08...
- NaN
- 0001-625X
- 2353-074X
- mycology, micromycetes, marcomycetes, slime mo...
- English
- Polish Botanical Society
- Poland
- NaN
- NaN
- CC BY
- NaN
- https://pbsociety.org.pl/journals/index.php/am...
- Yes
- https://doi.org/10.5586/am.5511
- Yes
- https://pbsociety.org.pl/journals/index.php/am...
- Double blind peer review
- https://pbsociety.org.pl/journals/index.php/am...
- Yes
- https://pbsociety.org.pl/journals/index.php/am...
- https://pbsociety.org.pl/journals/index.php/am...
- https://pbsociety.org.pl/journals/index.php/am...
- https://pbsociety.org.pl/journals/index.php/am...
- 16
- Yes
- https://pbsociety.org.pl/journals/index.php/am...
- 400 EUR
- No
- NaN
- No
- NaN
- NaN
- NaN
- NaN
- Sherpa/Romeo
- https://v2.sherpa.ac.uk/id/publication/25478
- DOI
- Yes
- Yes
- Yes
- https://pbsociety.org.pl/journals/index.php/am...
- NaN
- NaN
- QH301-705.5
- Science: Biology (General)
- No
- 2014-05-29T20:02:32Z
- 2021-01-16T17:41:32Z
- 1154
- 2021-03-05T18:55:46Z
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 16024
- BME Frontiers
- https://spj.sciencemag.org/bmef
- https://doaj.org/toc/f9fa881c1be5443a86ed71c2e...
- Biomedical Engineering Frontiers
- NaN
- 2765-8031
- biomedical imaging, biomedical devices, biomat...
- English
- American Association for the Advancement of Sc...
- United States
- Suzhou Institute of Biomedical Engineering and...
- China
- CC BY
- NaN
- https://spj.sciencemag.org/bmef/guidelines/#co...
- Yes
- https://spj.sciencemag.org/journals/bmef/2020/...
- No
- https://spj.sciencemag.org/bmef/guidelines/#co...
- Blind peer review
- https://spj.sciencemag.org/bmef/peer-review-pr...
- Yes
- https://spj.sciencemag.org/bmef/publication-et...
- https://spj.sciencemag.org/bmef/about/#mission...
- https://spj.sciencemag.org/bmef/editors/
- https://spj.sciencemag.org/bmef/guidelines/
- 16
- No
- https://spj.sciencemag.org/bmef/apc/
- NaN
- Yes
- https://spj.sciencemag.org/bmef/apc/
- No
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- DOI
- Yes
- Yes
- Yes
- https://spj.sciencemag.org/bmef/about/
- NaN
- NaN
- R855-855.5|TP248.13-248.65
- Medicine: Medicine (General): Medical technolo...
- No
- 2021-01-22T11:54:20Z
- 2021-01-22T11:54:20Z
- 11
- 2021-03-08T09:06:36Z
-
-
- 16025
- Harvard Kennedy School Misinformation Review
- https://misinforeview.hks.harvard.edu
- https://doaj.org/toc/d71096ec7090499681cc0ccf8...
- HKS Misinformation Review
- NaN
- 2766-1652
- misinformation, disinformation, fake news
- English
- Harvard Kennedy School
- United States
- NaN
- NaN
- CC BY
- NaN
- https://misinforeview.hks.harvard.edu/editoria...
- Yes
- https://misinforeview.hks.harvard.edu/article/...
- Yes
- https://misinforeview.hks.harvard.edu/editoria...
- Double blind peer review
- https://misinforeview.hks.harvard.edu/editoria...
- No
- NaN
- https://misinforeview.hks.harvard.edu/our-miss...
- https://misinforeview.hks.harvard.edu/editoria...
- https://misinforeview.hks.harvard.edu/submit/
- 10
- No
- https://misinforeview.hks.harvard.edu/editoria...
- NaN
- No
- NaN
- No
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- DOI
- Yes
- No
- Yes
- https://misinforeview.hks.harvard.edu/editoria...
- NaN
- NaN
- T58.5-58.64|P87-96
- Technology: Technology (General): Industrial e...
- No
- 2021-02-12T10:29:21Z
- 2021-02-12T10:29:21Z
- 0
- NaN
-
-
- 16026
- One Health & Risk Management
- https://journal.ohrm.bba.md/index.php/journal-...
- https://doaj.org/toc/68671b966cd24a0ebaa44d78f...
- OH&RM
- 2887-3458
- 2587-3466
- one health, risc management, public health, hu...
- English, Romanian, French, Russian
- Asociatia de Biosiguranta si Biosecuritate
- Moldova, Republic of
- NaN
- NaN
- CC BY
- NaN
- https://journal.ohrm.bba.md/index.php/journal-...
- Yes
- https://journal.ohrm.bba.md/index.php/journal-...
- Yes
- https://journal.ohrm.bba.md/index.php/journal-...
- Double blind peer review
- https://journal.ohrm.bba.md/index.php/journal-...
- No
- NaN
- https://journal.ohrm.bba.md/index.php/journal-...
- https://journal.ohrm.bba.md/index.php/journal-...
- https://journal.ohrm.bba.md/index.php/journal-...
- 10
- No
- https://journal.ohrm.bba.md/index.php/journal-...
- NaN
- No
- NaN
- No
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- DOI, UDC
- Yes
- No
- Yes
- https://journal.ohrm.bba.md/index.php/journal-...
- NaN
- NaN
- R|Q
- Medicine | Science
- No
- 2021-03-04T16:06:58Z
- 2021-03-04T16:06:58Z
- 4
- 2021-03-04T20:46:57Z
-
-
- 16027
- فصلنامه پژوهشهای مدیریت منابع انسانی
- https://hrmj.ihu.ac.ir/?lang=en
- https://doaj.org/toc/87d44ffb6ff849b18d5ddce9c...
- Journal of Research in Human Resources Management
- 8254-8002
- 2645-5072
- human resources management
- Persian
- Imam Hussein University
- Iran, Islamic Republic of
- NaN
- NaN
- CC BY
- NaN
- https://hrmj.ihu.ac.ir/journal/about?lang=en
- NaN
- NaN
- Yes
- https://hrmj.ihu.ac.ir/journal/about?lang=en
- Double blind peer review
- https://hrmj.ihu.ac.ir/journal/process?lang=en
- No
- NaN
- https://hrmj.ihu.ac.ir/journal/aim_scope?lang=en
- https://hrmj.ihu.ac.ir/journal/editorial.board...
- https://hrmj.ihu.ac.ir/journal/authors.note?la...
- 20
- No
- https://hrmj.ihu.ac.ir/?lang=en
- NaN
- No
- NaN
- No
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- No
- No
- Yes
- https://hrmj.ihu.ac.ir/?lang=en
- NaN
- NaN
- HF5549-5549.5
- Social Sciences: Commerce: Business: Personnel...
- No
- 2021-01-20T11:27:05Z
- 2021-01-20T11:27:05Z
- 0
- NaN
-
-
- 16028
- Science of Tsunami Hazards
- http://tsunamisociety.org/
- https://doaj.org/toc/a4f06be11f4f4db489dc034c7...
- NaN
- 8755-6839
- NaN
- tsunamis, tsunami warning systems, earthquakes...
- English
- Tsunami Society International
- United States
- Tsunami Society International
- NaN
- CC BY
- NaN
- http://tsunamisociety.org/InstructionsAuthors....
- NaN
- NaN
- No
- NaN
- Peer review
- http://tsunamisociety.org/PeerReview.html
- No
- NaN
- http://tsunamisociety.org/AboutUs.html
- http://tsunamisociety.org/EditorialBoard.html
- http://tsunamisociety.org/InstructionsAuthors....
- 12
- No
- http://tsunamisociety.org/InstructionsAuthors....
- NaN
- No
- NaN
- Yes
- http://tsunamisociety.org/InstructionsAuthors....
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- Yes
- http://tsunamisociety.org/AboutUs.html
- NaN
- NaN
- GC1-1581
- Geography. Anthropology. Recreation: Oceanography
- No
- 2009-04-16T17:40:30Z
- 2016-07-21T16:09:38Z
- 239
- 2021-02-27T01:00:51Z
-
-
-
-
16029 rows × 53 columns
-
-
-
-
-
-```python
-# keep the rows that charge an APC
-doaj_apc = doaj.loc[doaj['APC'] == 'Yes', ['Journal ISSN (print version)', 'Journal EISSN (online version)', 'APC amount']].copy()
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) | APC amount |
-|---|---|---|---|
-| 2 | 0001-5555 | 1651-2057 | 1600 EUR |
-| 4 | 0001-625X | 2353-074X | 400 EUR |
-| 5 | 0001-6918 | 1873-6297 | 1500 USD |
-| 6 | 0001-6977 | 2083-9480 | 520 EUR |
-| 11 | 0003-1062 | 2327-9788 | 3500 USD |
-| ... | ... | ... | ... |
-| 16002 | NaN | 2722-1253 | 200 USD |
-| 16004 | NaN | 2722-7235 | 35 USD |
-| 16005 | 2722-9688 | 2722-9696 | 500000 IDR |
-| 16007 | NaN | 2723-1097 | 100000 IDR |
-| 16022 | 2765-0189 | 2765-0235 | 700 USD |
-
-
-
-
4462 rows × 3 columns
-
-
-
-
-
-```python
-# keep the rows without an APC
-doaj_apc_no = doaj.loc[doaj['APC'] == 'No', ['Journal ISSN (print version)', 'Journal EISSN (online version)']].copy()
-doaj_apc_no
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) |
-|---|---|---|
-| 0 | 0001-3765 | 1678-2690 |
-| 1 | 0001-494X | 2282-0035 |
-| 3 | 0001-6012 | 2215-5856 |
-| 7 | 0001-7019 | 1846-0410 |
-| 8 | 0002-0397 | 1868-6869 |
-| ... | ... | ... |
-| 16024 | NaN | 2765-8031 |
-| 16025 | NaN | 2766-1652 |
-| 16026 | 2887-3458 | 2587-3466 |
-| 16027 | 8254-8002 | 2645-5072 |
-| 16028 | 8755-6839 | NaN |
-
-
-
-
11567 rows × 2 columns
-
-
-
-
-
-```python
-# set the APC amount to 0
-doaj_apc_no['APC amount'] = 0
-doaj_apc_no
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) | APC amount |
-|---|---|---|---|
-| 0 | 0001-3765 | 1678-2690 | 0 |
-| 1 | 0001-494X | 2282-0035 | 0 |
-| 3 | 0001-6012 | 2215-5856 | 0 |
-| 7 | 0001-7019 | 1846-0410 | 0 |
-| 8 | 0002-0397 | 1868-6869 | 0 |
-| ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 |
-| 16025 | NaN | 2766-1652 | 0 |
-| 16026 | 2887-3458 | 2587-3466 | 0 |
-| 16027 | 8254-8002 | 2645-5072 | 0 |
-| 16028 | 8755-6839 | NaN | 0 |
-
-
-
-
11567 rows × 3 columns
-
-
-
-
-
-```python
-# append to the APC table
-doaj_apc = pd.concat([doaj_apc, doaj_apc_no], ignore_index=True)
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) | APC amount |
-|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR |
-| 1 | 0001-625X | 2353-074X | 400 EUR |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD |
-| ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 |
-| 16025 | NaN | 2766-1652 | 0 |
-| 16026 | 2887-3458 | 2587-3466 | 0 |
-| 16027 | 8254-8002 | 2645-5072 | 0 |
-| 16028 | 8755-6839 | NaN | 0 |
-
-
-
-
16029 rows × 3 columns
-
-
-
-
-
-```python
-# split the price into 'amount' and 'symbol'
-doaj_apc[['amount', 'symbol']] = doaj_apc['APC amount'].str.split(' ', n=1, expand=True)
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) | APC amount | amount | symbol |
-|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD |
-| ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | NaN | NaN |
-| 16025 | NaN | 2766-1652 | 0 | NaN | NaN |
-| 16026 | 2887-3458 | 2587-3466 | 0 | NaN | NaN |
-| 16027 | 8254-8002 | 2645-5072 | 0 | NaN | NaN |
-| 16028 | 8755-6839 | NaN | 0 | NaN | NaN |
-
-
-
-
16029 rows × 5 columns
-
-
-
-
-
-```python
-# rows without an APC hold the integer 0, which .str.split leaves as NaN
-doaj_apc.loc[doaj_apc['APC amount'] == 0, 'amount'] = 0
-doaj_apc.loc[doaj_apc['APC amount'] == 0, 'symbol'] = ''
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) | APC amount | amount | symbol |
-|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD |
-| ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | 0 |  |
-| 16025 | NaN | 2766-1652 | 0 | 0 |  |
-| 16026 | 2887-3458 | 2587-3466 | 0 | 0 |  |
-| 16027 | 8254-8002 | 2645-5072 | 0 | 0 |  |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  |
-
-
-
-
16029 rows × 5 columns
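
The two cells above first split `APC amount` and then patch the zero rows separately. As a minimal sketch, the same rule can be written as one plain-Python helper that handles both the `'<amount> <currency>'` strings and the integer 0. The name `split_apc` is hypothetical and not part of the notebook, and the sketch assumes those are the only two value shapes that occur.

```python
def split_apc(value):
    """Split a DOAJ-style APC value into (amount, currency symbol).

    Assumes values are either the integer 0 (no APC) or a string such
    as '1600 EUR'; anything else is returned as (None, None).
    """
    if value == 0:
        return 0, ''
    if isinstance(value, str):
        parts = value.split(' ', 1)
        if len(parts) == 2:
            return parts[0], parts[1]
    return None, None


print(split_apc('1600 EUR'))
print(split_apc(0))
```

As in the notebook, the amount stays a string here; converting it to a number would be a separate step.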
-
-
-
-
-
-```python
-# add the missing fields
-doaj_apc['cost_factor_type'] = 1
-doaj_apc['comment'] = 'Source: DOAJ'
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | Journal ISSN (print version) | Journal EISSN (online version) | APC amount | amount | symbol | cost_factor_type | comment |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR | 1 | Source: DOAJ |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR | 1 | Source: DOAJ |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD | 1 | Source: DOAJ |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR | 1 | Source: DOAJ |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD | 1 | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16025 | NaN | 2766-1652 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16026 | 2887-3458 | 2587-3466 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16027 | 8254-8002 | 2645-5072 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  | 1 | Source: DOAJ |
-
-
-
-
16029 rows × 7 columns
-
-
-
-
-
-```python
-# rename the fields
-doaj_apc = doaj_apc.rename(columns = {'Journal ISSN (print version)' : 'issn_print', 'Journal EISSN (online version)' : 'issn_electronic'})
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | issn_print | issn_electronic | APC amount | amount | symbol | cost_factor_type | comment |
-|---|---|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR | 1 | Source: DOAJ |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR | 1 | Source: DOAJ |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD | 1 | Source: DOAJ |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR | 1 | Source: DOAJ |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD | 1 | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16025 | NaN | 2766-1652 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16026 | 2887-3458 | 2587-3466 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16027 | 8254-8002 | 2645-5072 | 0 | 0 |  | 1 | Source: DOAJ |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  | 1 | Source: DOAJ |
-
-
-
-
16029 rows × 7 columns
-
-
-
-
-
-```python
-# add the issn column, using the electronic ISSN by default
-doaj_apc['issn'] = doaj_apc['issn_electronic']
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | issn_print | issn_electronic | APC amount | amount | symbol | cost_factor_type | comment | issn |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR | 1 | Source: DOAJ | 1651-2057 |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR | 1 | Source: DOAJ | 2353-074X |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD | 1 | Source: DOAJ | 1873-6297 |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR | 1 | Source: DOAJ | 2083-9480 |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD | 1 | Source: DOAJ | 2327-9788 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | 0 |  | 1 | Source: DOAJ | 2765-8031 |
-| 16025 | NaN | 2766-1652 | 0 | 0 |  | 1 | Source: DOAJ | 2766-1652 |
-| 16026 | 2887-3458 | 2587-3466 | 0 | 0 |  | 1 | Source: DOAJ | 2587-3466 |
-| 16027 | 8254-8002 | 2645-5072 | 0 | 0 |  | 1 | Source: DOAJ | 2645-5072 |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  | 1 | Source: DOAJ | NaN |
-
-
-
-
16029 rows × 8 columns
-
-
-
-
-
-```python
-doaj_apc.loc[doaj_apc['issn'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn_print | issn_electronic | APC amount | amount | symbol | cost_factor_type | comment | issn |
-|---|---|---|---|---|---|---|---|---|
-| 12 | 0013-9998 | NaN | 350 EUR | 350 | EUR | 1 | Source: DOAJ | NaN |
-| 14 | 0015-4040 | NaN | 747 USD | 747 | USD | 1 | Source: DOAJ | NaN |
-| 17 | 0017-0011 | NaN | 400 EUR | 400 | EUR | 1 | Source: DOAJ | NaN |
-| 29 | 0026-1165 | NaN | 220000 JPY | 220000 | JPY | 1 | Source: DOAJ | NaN |
-| 30 | 0026-279X | NaN | 350 USD | 350 | USD | 1 | Source: DOAJ | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 15867 | 2676-5357 | NaN | 0 | 0 |  | 1 | Source: DOAJ | NaN |
-| 15892 | 2686-9594 | NaN | 0 | 0 |  | 1 | Source: DOAJ | NaN |
-| 15937 | 2701-1569 | NaN | 0 | 0 |  | 1 | Source: DOAJ | NaN |
-| 15974 | 2709-8370 | NaN | 0 | 0 |  | 1 | Source: DOAJ | NaN |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  | 1 | Source: DOAJ | NaN |
-
-
-
-
1461 rows × 8 columns
-
-
-
-
-
-```python
-# fall back to the print ISSN when issn is empty
-doaj_apc.loc[doaj_apc['issn'].isna(), 'issn'] = doaj_apc['issn_print']
-doaj_apc.loc[doaj_apc['issn'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | issn_print | issn_electronic | APC amount | amount | symbol | cost_factor_type | comment | issn |
-|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
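
The copy-then-patch steps above (copy `issn_electronic`, then fill the gaps from `issn_print`) can also be expressed in one step with `Series.fillna`. The following is only a sketch on a toy frame whose values are taken from the outputs above; the frame name `df` is made up for the illustration.

```python
import numpy as np
import pandas as pd

# Toy rows shaped like the notebook's table (values taken from its output).
df = pd.DataFrame({
    'issn_print': ['0001-5555', '8755-6839', np.nan],
    'issn_electronic': ['1651-2057', np.nan, '2765-8031'],
})

# Prefer the electronic ISSN; fall back to the print ISSN when it is missing.
df['issn'] = df['issn_electronic'].fillna(df['issn_print'])
print(df)
```

Both variants align on the row index, so they should give the same `issn` column.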
-```python
-# look up the linking ISSN (ISSN-L) for each ISSN
-doaj_apc = pd.merge(doaj_apc, issns, on='issn', how='left')
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | issn_print | issn_electronic | APC amount | amount | symbol | cost_factor_type | comment | issn | issnl |
-|---|---|---|---|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR | 1 | Source: DOAJ | 1651-2057 | 0001-5555 |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR | 1 | Source: DOAJ | 2353-074X | 0001-625X |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD | 1 | Source: DOAJ | 1873-6297 | 0001-6918 |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR | 1 | Source: DOAJ | 2083-9480 | 0001-6977 |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD | 1 | Source: DOAJ | 2327-9788 | 0003-1062 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | 0 |  | 1 | Source: DOAJ | 2765-8031 | NaN |
-| 16025 | NaN | 2766-1652 | 0 | 0 |  | 1 | Source: DOAJ | 2766-1652 | NaN |
-| 16026 | 2887-3458 | 2587-3466 | 0 | 0 |  | 1 | Source: DOAJ | 2587-3466 | NaN |
-| 16027 | 8254-8002 | 2645-5072 | 0 | 0 |  | 1 | Source: DOAJ | 2645-5072 | NaN |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  | 1 | Source: DOAJ | 8755-6839 | 8755-6839 |
-
-
-
-
16029 rows × 9 columns
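
The left merge above leaves `issnl` as NaN wherever an ISSN is absent from the lookup table. One way to list those unmatched ISSNs is the `indicator` option of `pd.merge`. This is a sketch on miniature tables: the two-column layout of `issns` (`issn`, `issnl`) is inferred from the output above, and the frame name `journals` is made up.

```python
import pandas as pd

# Hypothetical miniature of the `issns` lookup table (issn -> ISSN-L).
issns = pd.DataFrame({
    'issn': ['1651-2057', '8755-6839'],
    'issnl': ['0001-5555', '8755-6839'],
})
journals = pd.DataFrame({'issn': ['1651-2057', '2765-8031', '8755-6839']})

# indicator=True adds a `_merge` column telling where each row was found.
merged = pd.merge(journals, issns, on='issn', how='left', indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'issn']
print(merged)
print(unmatched.tolist())
```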
-
-
-
-
-
-```python
-# rename the columns
-doaj_apc = doaj_apc.rename(columns={'issnl' : 'issn_link'})
-doaj_apc
-```
-
-
-
-
-
-
-
-
-
-
-| | issn_print | issn_electronic | APC amount | amount | symbol | cost_factor_type | comment | issn | issn_link |
-|---|---|---|---|---|---|---|---|---|---|
-| 0 | 0001-5555 | 1651-2057 | 1600 EUR | 1600 | EUR | 1 | Source: DOAJ | 1651-2057 | 0001-5555 |
-| 1 | 0001-625X | 2353-074X | 400 EUR | 400 | EUR | 1 | Source: DOAJ | 2353-074X | 0001-625X |
-| 2 | 0001-6918 | 1873-6297 | 1500 USD | 1500 | USD | 1 | Source: DOAJ | 1873-6297 | 0001-6918 |
-| 3 | 0001-6977 | 2083-9480 | 520 EUR | 520 | EUR | 1 | Source: DOAJ | 2083-9480 | 0001-6977 |
-| 4 | 0003-1062 | 2327-9788 | 3500 USD | 3500 | USD | 1 | Source: DOAJ | 2327-9788 | 0003-1062 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 16024 | NaN | 2765-8031 | 0 | 0 |  | 1 | Source: DOAJ | 2765-8031 | NaN |
-| 16025 | NaN | 2766-1652 | 0 | 0 |  | 1 | Source: DOAJ | 2766-1652 | NaN |
-| 16026 | 2887-3458 | 2587-3466 | 0 | 0 |  | 1 | Source: DOAJ | 2587-3466 | NaN |
-| 16027 | 8254-8002 | 2645-5072 | 0 | 0 |  | 1 | Source: DOAJ | 2645-5072 | NaN |
-| 16028 | 8755-6839 | NaN | 0 | 0 |  | 1 | Source: DOAJ | 8755-6839 | 8755-6839 |
-
-
-
-
16029 rows × 9 columns
-
-
-
-
-### Adding APCs from the Journal Database (Zurich Open Repository and Archive)
-
-https://www.jdb.uzh.ch/
-
-
-```python
-# JDB, the Zurich journal database
-jdb = pd.read_csv('zora/jdb_apcs.tsv', encoding='utf-8', header=0, sep='\t')
-jdb
-```
-
-
-
-
-
-
-
-
-
-
-| | id | issn_print | issn_electronic | issn_link | apc_fee | apc_currency | apc_date |
-|---|---|---|---|---|---|---|---|
-| 0 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2490 | USD | 2018 |
-| 1 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2950 | USD | 2020 |
-| 2 | 10002 | 0952-3383 | 1467-8578 | 0952-3383 | 2500 | EUR | 2017 |
-| 3 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2018 |
-| 4 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2020 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 11575 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3000 | USD | 2015 |
-| 11576 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2016 |
-| 11577 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2017 |
-| 11578 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3750 | USD | 2018 |
-| 11579 | 9995 | 0816-4649 | 1465-3303 | 0816-4649 | 2950 | USD | 2017 |
-
-
-
-
11580 rows × 7 columns
-
-
-
-
-
-```python
-# rename the id column
-jdb = jdb.rename(columns = {'id' : 'jdb_id'})
-jdb
-```
-
-
-
-
-
-
-
-
-
-
-| | jdb_id | issn_print | issn_electronic | issn_link | apc_fee | apc_currency | apc_date |
-|---|---|---|---|---|---|---|---|
-| 0 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2490 | USD | 2018 |
-| 1 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2950 | USD | 2020 |
-| 2 | 10002 | 0952-3383 | 1467-8578 | 0952-3383 | 2500 | EUR | 2017 |
-| 3 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2018 |
-| 4 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2020 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 11575 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3000 | USD | 2015 |
-| 11576 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2016 |
-| 11577 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2017 |
-| 11578 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3750 | USD | 2018 |
-| 11579 | 9995 | 0816-4649 | 1465-3303 | 0816-4649 | 2950 | USD | 2017 |
-
-
-
-
11580 rows × 7 columns
-
-
-
-
-
-```python
-# add the missing fields
-jdb['cost_factor_type'] = 1
-jdb['comment'] = 'Source: JDB (' + jdb['apc_date'].astype(str) + ')'
-jdb
-```
-
-
-
-
-
-
-
-
-
-
-| | jdb_id | issn_print | issn_electronic | issn_link | apc_fee | apc_currency | apc_date | cost_factor_type | comment |
-|---|---|---|---|---|---|---|---|---|---|
-| 0 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2490 | USD | 2018 | 1 | Source: JDB (2018) |
-| 1 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2950 | USD | 2020 | 1 | Source: JDB (2020) |
-| 2 | 10002 | 0952-3383 | 1467-8578 | 0952-3383 | 2500 | EUR | 2017 | 1 | Source: JDB (2017) |
-| 3 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2018 | 1 | Source: JDB (2018) |
-| 4 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2020 | 1 | Source: JDB (2020) |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 11575 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3000 | USD | 2015 | 1 | Source: JDB (2015) |
-| 11576 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2016 | 1 | Source: JDB (2016) |
-| 11577 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2017 | 1 | Source: JDB (2017) |
-| 11578 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3750 | USD | 2018 | 1 | Source: JDB (2018) |
-| 11579 | 9995 | 0816-4649 | 1465-3303 | 0816-4649 | 2950 | USD | 2017 | 1 | Source: JDB (2017) |
-
-
-
-
11580 rows × 9 columns
-
-
-
-
-
-```python
-# rename the fields
-jdb = jdb.rename(columns = {'apc_fee' : 'amount', 'apc_currency' : 'symbol'})
-jdb
-```
-
-
-
-
-
-
-
-
-
-
-| | jdb_id | issn_print | issn_electronic | issn_link | amount | symbol | apc_date | cost_factor_type | comment |
-|---|---|---|---|---|---|---|---|---|---|
-| 0 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2490 | USD | 2018 | 1 | Source: JDB (2018) |
-| 1 | 10001 | 1662-5161 | 1662-5161 | 1662-5161 | 2950 | USD | 2020 | 1 | Source: JDB (2020) |
-| 2 | 10002 | 0952-3383 | 1467-8578 | 0952-3383 | 2500 | EUR | 2017 | 1 | Source: JDB (2017) |
-| 3 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2018 | 1 | Source: JDB (2018) |
-| 4 | 10005 | 1179-7258 | 1179-7258 | 1179-7258 | 1958 | USD | 2020 | 1 | Source: JDB (2020) |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 11575 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3000 | USD | 2015 | 1 | Source: JDB (2015) |
-| 11576 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2016 | 1 | Source: JDB (2016) |
-| 11577 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3550 | USD | 2017 | 1 | Source: JDB (2017) |
-| 11578 | 9986 | 1549-9634 | 1549-9642 | 1549-9634 | 3750 | USD | 2018 | 1 | Source: JDB (2018) |
-| 11579 | 9995 | 0816-4649 | 1465-3303 | 0816-4649 | 2950 | USD | 2017 | 1 | Source: JDB (2017) |
-
-
-
-
11580 rows × 9 columns
-
-
-
-
-
-```python
-# keep only the last listed row for each JDB journal
-jdb = jdb.drop_duplicates(subset='jdb_id', keep='last')
-```
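
`keep='last'` keeps whichever row comes last in file order for each `jdb_id`. If the intent is to keep the most recent `apc_date` (an assumption; the notebook does not say so), sorting by year first makes that explicit. A minimal sketch on toy rows, deliberately listed out of order:

```python
import pandas as pd

# Toy rows based on the JDB output above, with the years out of order.
jdb = pd.DataFrame({
    'jdb_id': [10001, 10001, 10002],
    'amount': [2950, 2490, 2500],
    'apc_date': [2020, 2018, 2017],
})

# Sort by year first so keep='last' always means "most recent year".
latest = (jdb.sort_values('apc_date')
             .drop_duplicates(subset='jdb_id', keep='last')
             .sort_values('jdb_id')
             .reset_index(drop=True))
print(latest)
```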
-
-
-```python
-# import OpenAPC with the maximum values
-openapc = pd.read_csv('openapc/open_apc_max.tsv', encoding='utf-8', header=0, sep='\t')
-openapc
-```
-
-
-
-
-
-
-
-
-
-
-| | period | euro | issn | issn_print | issn_electronic | issn_l |
-|---|---|---|---|---|---|---|
-| 0 | 2018 | 1385.36 | 0001-0782 | 0001-0782 | NaN | 0001-0782 |
-| 1 | 2018 | 1811.88 | 0001-1452 | 0001-1452 | 1533-385X | 0001-1452 |
-| 2 | 2020 | 1826.49 | 0001-1452 | 0001-1452 | 1533-385X | 0001-1452 |
-| 3 | 2013 | 2238.76 | 0001-1541 | NaN | NaN | 0001-1541 |
-| 4 | 2014 | 1887.86 | 0001-1541 | NaN | NaN | 0001-1541 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 23793 | 2013 | 2400.00 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23794 | 2014 | 1822.49 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23795 | 2016 | 1762.69 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23796 | 2017 | 3248.31 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23797 | 2019 | 2913.11 | 8756-7938 | NaN | NaN | 1520-6033 |
-
-
-
-
23798 rows × 6 columns
-
-
-
-
-
-```python
-# rename the fields
-openapc = openapc.rename(columns = {'period' : 'apc_date', 'issn_l' : 'issn_link', 'euro' : 'amount'})
-openapc
-```
-
-
-
-
-
-
-
-
-
-
-| | apc_date | amount | issn | issn_print | issn_electronic | issn_link |
-|---|---|---|---|---|---|---|
-| 0 | 2018 | 1385.36 | 0001-0782 | 0001-0782 | NaN | 0001-0782 |
-| 1 | 2018 | 1811.88 | 0001-1452 | 0001-1452 | 1533-385X | 0001-1452 |
-| 2 | 2020 | 1826.49 | 0001-1452 | 0001-1452 | 1533-385X | 0001-1452 |
-| 3 | 2013 | 2238.76 | 0001-1541 | NaN | NaN | 0001-1541 |
-| 4 | 2014 | 1887.86 | 0001-1541 | NaN | NaN | 0001-1541 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 23793 | 2013 | 2400.00 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23794 | 2014 | 1822.49 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23795 | 2016 | 1762.69 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23796 | 2017 | 3248.31 | 8756-7938 | NaN | NaN | 1520-6033 |
-| 23797 | 2019 | 2913.11 | 8756-7938 | NaN | NaN | 1520-6033 |
-
-
-
-
23798 rows × 6 columns
-
-
-
-
-
-```python
-# add the cost factor type and the currency symbol
-openapc['cost_factor_type'] = 1
-openapc['jdb_id'] = np.nan
-openapc['symbol'] = 'EUR'
-openapc['comment'] = 'Source: OpenAPC (' + openapc['apc_date'].astype(str) + ')'
-openapc
-```
-
-
-
-
-
-
-
-
-
-
-| | apc_date | amount | issn | issn_print | issn_electronic | issn_link | cost_factor_type | jdb_id | symbol | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 2018 | 1385.36 | 0001-0782 | 0001-0782 | NaN | 0001-0782 | 1 | NaN | EUR | Source: OpenAPC (2018) |
-| 1 | 2018 | 1811.88 | 0001-1452 | 0001-1452 | 1533-385X | 0001-1452 | 1 | NaN | EUR | Source: OpenAPC (2018) |
-| 2 | 2020 | 1826.49 | 0001-1452 | 0001-1452 | 1533-385X | 0001-1452 | 1 | NaN | EUR | Source: OpenAPC (2020) |
-| 3 | 2013 | 2238.76 | 0001-1541 | NaN | NaN | 0001-1541 | 1 | NaN | EUR | Source: OpenAPC (2013) |
-| 4 | 2014 | 1887.86 | 0001-1541 | NaN | NaN | 0001-1541 | 1 | NaN | EUR | Source: OpenAPC (2014) |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 23793 | 2013 | 2400.00 | 8756-7938 | NaN | NaN | 1520-6033 | 1 | NaN | EUR | Source: OpenAPC (2013) |
-| 23794 | 2014 | 1822.49 | 8756-7938 | NaN | NaN | 1520-6033 | 1 | NaN | EUR | Source: OpenAPC (2014) |
-| 23795 | 2016 | 1762.69 | 8756-7938 | NaN | NaN | 1520-6033 | 1 | NaN | EUR | Source: OpenAPC (2016) |
-| 23796 | 2017 | 3248.31 | 8756-7938 | NaN | NaN | 1520-6033 | 1 | NaN | EUR | Source: OpenAPC (2017) |
-| 23797 | 2019 | 2913.11 | 8756-7938 | NaN | NaN | 1520-6033 | 1 | NaN | EUR | Source: OpenAPC (2019) |
-
-
-
-
23798 rows × 10 columns
-
-
-
-
-
-```python
-# append the OpenAPC rows (sort=True keeps the column order shown below)
-jdb = pd.concat([jdb, openapc], ignore_index=True, sort=True)
-jdb
-```
-
-
-
-
-
-
-
-
-
-
-
-
-| | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 2950.00 | 2020 | Source: JDB (2020) | 1 | NaN | 1662-5161 | 1662-5161 | 1662-5161 | 10001.0 | USD |
-| 1 | 2500.00 | 2017 | Source: JDB (2017) | 1 | NaN | 1467-8578 | 0952-3383 | 0952-3383 | 10002.0 | EUR |
-| 2 | 1958.00 | 2020 | Source: JDB (2020) | 1 | NaN | 1179-7258 | 1179-7258 | 1179-7258 | 10005.0 | USD |
-| 3 | 1370.00 | 2020 | Source: JDB (2020) | 1 | NaN | 1479-5876 | 1479-5876 | NaN | 10015.0 | GBP |
-| 4 | 2200.00 | 2017 | Source: JDB (2017) | 1 | NaN | 1572-8552 | 1383-4924 | 1383-4924 | 10023.0 | EUR |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 29947 | 2400.00 | 2013 | Source: OpenAPC (2013) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29948 | 1822.49 | 2014 | Source: OpenAPC (2014) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29949 | 1762.69 | 2016 | Source: OpenAPC (2016) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29950 | 3248.31 | 2017 | Source: OpenAPC (2017) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29951 | 2913.11 | 2019 | Source: OpenAPC (2019) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-
-
-
-
29952 rows × 10 columns
-
-
-
-
-
-```python
-# drop duplicates by ISSN-L and date
-jdb = jdb.drop_duplicates(subset=['issn_link', 'apc_date'], keep='first')
-jdb
-```
-
-
-
-
-
-
-
-
-
-
-| | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 2950.00 | 2020 | Source: JDB (2020) | 1 | NaN | 1662-5161 | 1662-5161 | 1662-5161 | 10001.0 | USD |
-| 1 | 2500.00 | 2017 | Source: JDB (2017) | 1 | NaN | 1467-8578 | 0952-3383 | 0952-3383 | 10002.0 | EUR |
-| 2 | 1958.00 | 2020 | Source: JDB (2020) | 1 | NaN | 1179-7258 | 1179-7258 | 1179-7258 | 10005.0 | USD |
-| 3 | 1370.00 | 2020 | Source: JDB (2020) | 1 | NaN | 1479-5876 | 1479-5876 | NaN | 10015.0 | GBP |
-| 4 | 2200.00 | 2017 | Source: JDB (2017) | 1 | NaN | 1572-8552 | 1383-4924 | 1383-4924 | 10023.0 | EUR |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 29947 | 2400.00 | 2013 | Source: OpenAPC (2013) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29948 | 1822.49 | 2014 | Source: OpenAPC (2014) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29949 | 1762.69 | 2016 | Source: OpenAPC (2016) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29950 | 3248.31 | 2017 | Source: OpenAPC (2017) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 29951 | 2913.11 | 2019 | Source: OpenAPC (2019) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-
-
-
-
29478 rows × 10 columns
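
Because the JDB rows come first in the combined table, `keep='first'` means that a JDB entry wins over an OpenAPC entry for the same `issn_link` and year. The sketch below illustrates this on toy data; the JDB row is invented for the illustration, and only the two OpenAPC rows are taken from the output above.

```python
import pandas as pd

# Invented JDB row (illustration only) for a journal that OpenAPC also covers.
jdb = pd.DataFrame({
    'issn_link': ['0001-1452'], 'apc_date': [2018],
    'amount': [2000.0], 'symbol': ['USD'],
    'comment': ['Source: JDB (2018)'],
})
openapc = pd.DataFrame({
    'issn_link': ['0001-1452', '0001-1452'], 'apc_date': [2018, 2020],
    'amount': [1811.88, 1826.49], 'symbol': ['EUR', 'EUR'],
    'comment': ['Source: OpenAPC (2018)', 'Source: OpenAPC (2020)'],
})

# JDB rows are stacked first, so keep='first' gives them precedence.
combined = pd.concat([jdb, openapc], ignore_index=True)
deduped = combined.drop_duplicates(subset=['issn_link', 'apc_date'], keep='first')
print(deduped)
```

One caveat: pandas treats missing keys as equal to each other in `drop_duplicates`, so rows without an `issn_link` that share the same year may also be collapsed into one.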
-
-
-
-
-
-```python
-# add the DOAJ rows (sort=True keeps the column order shown below)
-cost_factor = pd.concat([doaj_apc, jdb], ignore_index=True, sort=True)
-cost_factor
-```
-
-
-
-
-
-
-
-
-
-
-| | APC amount | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol |
-|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1600 EUR | 1600 | NaN | Source: DOAJ | 1 | 1651-2057 | 1651-2057 | 0001-5555 | 0001-5555 | NaN | EUR |
-| 1 | 400 EUR | 400 | NaN | Source: DOAJ | 1 | 2353-074X | 2353-074X | 0001-625X | 0001-625X | NaN | EUR |
-| 2 | 1500 USD | 1500 | NaN | Source: DOAJ | 1 | 1873-6297 | 1873-6297 | 0001-6918 | 0001-6918 | NaN | USD |
-| 3 | 520 EUR | 520 | NaN | Source: DOAJ | 1 | 2083-9480 | 2083-9480 | 0001-6977 | 0001-6977 | NaN | EUR |
-| 4 | 3500 USD | 3500 | NaN | Source: DOAJ | 1 | 2327-9788 | 2327-9788 | 0003-1062 | 0003-1062 | NaN | USD |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 45502 | NaN | 2400 | 2013 | Source: OpenAPC (2013) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 45503 | NaN | 1822.49 | 2014 | Source: OpenAPC (2014) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 45504 | NaN | 1762.69 | 2016 | Source: OpenAPC (2016) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 45505 | NaN | 3248.31 | 2017 | Source: OpenAPC (2017) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-| 45506 | NaN | 2913.11 | 2019 | Source: OpenAPC (2019) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR |
-
-
-
-
45507 rows × 11 columns
-
-
-
-
-
-```python
-# test issnl
-cost_factor.loc[cost_factor['issn_link'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | APC amount | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol |
-|---|---|---|---|---|---|---|---|---|---|---|---|
-| 13 | 540 PLN | 540 | NaN | Source: DOAJ | 1 | 2544-8552 | 2544-8552 | NaN | 0014-8261 | NaN | PLN |
-| 62 | 100 USD | 100 | NaN | Source: DOAJ | 1 | 2545-3149 | 2545-3149 | NaN | 0079-4252 | NaN | USD |
-| 129 | 423 EUR | 423 | NaN | Source: DOAJ | 1 | 2605-3322 | 2605-3322 | NaN | 0212-9426 | NaN | EUR |
-| 133 | 200 EUR | 200 | NaN | Source: DOAJ | 1 | 2603-5987 | 2603-5987 | NaN | 0214-9877 | NaN | EUR |
-| 140 | 800000 IDR | 800000 | NaN | Source: DOAJ | 1 | 2621-1122 | 2621-1122 | NaN | 0216-3438 | NaN | IDR |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 26703 | NaN | 3873.61 | 2016 | Source: OpenAPC (2016) | 1 | 0263-8762 | NaN | NaN | 0263-8762 | NaN | EUR |
-| 26704 | NaN | 2557.73 | 2017 | Source: OpenAPC (2017) | 1 | 0263-8762 | NaN | NaN | 0263-8762 | NaN | EUR |
-| 26705 | NaN | 3564.25 | 2018 | Source: OpenAPC (2018) | 1 | 0263-8762 | NaN | NaN | 0263-8762 | NaN | EUR |
-| 27923 | NaN | 1130.5 | 2019 | Source: OpenAPC (2019) | 1 | 0342-183X | NaN | NaN | 0342-183X | NaN | EUR |
-| 45474 | NaN | 1690 | 2020 | Source: OpenAPC (2020) | 1 | 2691-9478 | NaN | NaN | NaN | NaN | EUR |
-
-
-
-
2500 rows × 11 columns
-
-
-
-
-
-```python
-# merge with the ISSN-L lookup table
-cost_factor = pd.merge(cost_factor, issns, on='issn', how='left')
-cost_factor
-```
-
-
-
-
-
-
-
-
-
-
-| | APC amount | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1600 EUR | 1600 | NaN | Source: DOAJ | 1 | 1651-2057 | 1651-2057 | 0001-5555 | 0001-5555 | NaN | EUR | 0001-5555 |
-| 1 | 400 EUR | 400 | NaN | Source: DOAJ | 1 | 2353-074X | 2353-074X | 0001-625X | 0001-625X | NaN | EUR | 0001-625X |
-| 2 | 1500 USD | 1500 | NaN | Source: DOAJ | 1 | 1873-6297 | 1873-6297 | 0001-6918 | 0001-6918 | NaN | USD | 0001-6918 |
-| 3 | 520 EUR | 520 | NaN | Source: DOAJ | 1 | 2083-9480 | 2083-9480 | 0001-6977 | 0001-6977 | NaN | EUR | 0001-6977 |
-| 4 | 3500 USD | 3500 | NaN | Source: DOAJ | 1 | 2327-9788 | 2327-9788 | 0003-1062 | 0003-1062 | NaN | USD | 0003-1062 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 45502 | NaN | 2400 | 2013 | Source: OpenAPC (2013) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR | 1520-6033 |
-| 45503 | NaN | 1822.49 | 2014 | Source: OpenAPC (2014) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR | 1520-6033 |
-| 45504 | NaN | 1762.69 | 2016 | Source: OpenAPC (2016) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR | 1520-6033 |
-| 45505 | NaN | 3248.31 | 2017 | Source: OpenAPC (2017) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR | 1520-6033 |
-| 45506 | NaN | 2913.11 | 2019 | Source: OpenAPC (2019) | 1 | 8756-7938 | NaN | 1520-6033 | NaN | NaN | EUR | 1520-6033 |
-
-
-
-
45507 rows × 12 columns
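
A minimal sketch of the left merge above, on toy data. The frames `costs` and `issns` are hypothetical stand-ins for `cost_factor` and the ISSN lookup, and `indicator=True` is an optional addition (not used in the notebook) that labels rows without an ISSN-L match, which is what the following `isna()` check looks for.

```python
import pandas as pd

# Toy stand-ins for cost_factor and the issn -> issnl lookup (illustrative values)
costs = pd.DataFrame({"issn": ["1651-2057", "2544-8552"], "amount": [1600, 540]})
issns = pd.DataFrame({"issn": ["1651-2057"], "issnl": ["0001-5555"]})

# A left merge keeps every cost row; rows without a match get NaN in 'issnl'
merged = pd.merge(costs, issns, on="issn", how="left", indicator=True)

# '_merge' is 'left_only' for rows that found no partner in the lookup
unmatched = merged.loc[merged["_merge"] == "left_only", "issn"].tolist()
print(unmatched)
```

Rows flagged `left_only` are the ones that need the fallback filling performed in the next cells.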
-
-
-
-
-
-```python
-# check which rows still have no issnl
-cost_factor.loc[cost_factor['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | APC amount | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 13 | 540 PLN | 540 | NaN | Source: DOAJ | 1 | 2544-8552 | 2544-8552 | NaN | 0014-8261 | NaN | PLN | NaN |
-| 62 | 100 USD | 100 | NaN | Source: DOAJ | 1 | 2545-3149 | 2545-3149 | NaN | 0079-4252 | NaN | USD | NaN |
-| 129 | 423 EUR | 423 | NaN | Source: DOAJ | 1 | 2605-3322 | 2605-3322 | NaN | 0212-9426 | NaN | EUR | NaN |
-| 133 | 200 EUR | 200 | NaN | Source: DOAJ | 1 | 2603-5987 | 2603-5987 | NaN | 0214-9877 | NaN | EUR | NaN |
-| 140 | 800000 IDR | 800000 | NaN | Source: DOAJ | 1 | 2621-1122 | 2621-1122 | NaN | 0216-3438 | NaN | IDR | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 45472 | NaN | 698.65 | 2019 | Source: OpenAPC (2019) | 1 | 2690-0009 | 2690-0009 | 2690-0009 | NaN | NaN | EUR | NaN |
-| 45473 | NaN | 754.67 | 2019 | Source: OpenAPC (2019) | 1 | 2690-3202 | NaN | 2690-3202 | NaN | NaN | EUR | NaN |
-| 45474 | NaN | 1690 | 2020 | Source: OpenAPC (2020) | 1 | 2691-9478 | NaN | NaN | NaN | NaN | EUR | NaN |
-| 45475 | NaN | 1523.2 | 2020 | Source: OpenAPC (2020) | 1 | 2699-0016 | 2699-0016 | 2699-0016 | NaN | NaN | EUR | NaN |
-| 45476 | NaN | 305 | 2020 | Source: OpenAPC (2020) | 1 | 2704-6192 | 2704-6192 | 2280-1855 | 2280-1855 | NaN | EUR | NaN |
-
-
-
-
8935 rows × 12 columns
-
-
-
-
-
-```python
-# fill in issn where it is missing (fallback order: print, electronic, link)
-cost_factor.loc[cost_factor['issn'].isna(), 'issn'] = cost_factor['issn_print']
-cost_factor.loc[cost_factor['issn'].isna(), 'issn'] = cost_factor['issn_electronic']
-cost_factor.loc[cost_factor['issn'].isna(), 'issn'] = cost_factor['issn_link']
-cost_factor.loc[cost_factor['issn'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | APC amount | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-# fill in issnl where it is missing (fallback order: link, print, electronic, issn)
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn_link']
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn_print']
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn_electronic']
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn']
-cost_factor.loc[cost_factor['issnl'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | APC amount | amount | apc_date | comment | cost_factor_type | issn | issn_electronic | issn_link | issn_print | jdb_id | symbol | issnl |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
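
The chained `.loc` assignments above fill `issnl` from the first non-missing fallback column. Assuming the same priority order, the idea can be sketched in a single expression with `bfill(axis=1)`. The toy frame `df` and the `priority` list are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Three illustrative rows: two lack an issnl, one already has it
df = pd.DataFrame({
    "issnl": [np.nan, np.nan, "0001-5555"],
    "issn_link": [np.nan, np.nan, "0001-5555"],
    "issn_print": ["0014-8261", np.nan, "0001-5555"],
    "issn_electronic": ["2544-8552", np.nan, "1651-2057"],
    "issn": ["2544-8552", "2691-9478", "1651-2057"],
})

# Scan the columns left to right and take the first non-missing value per row
priority = ["issnl", "issn_link", "issn_print", "issn_electronic", "issn"]
df["issnl"] = df[priority].bfill(axis=1).iloc[:, 0]
print(df["issnl"].tolist())
```

An existing `issnl` is never overwritten, because it sits first in the priority list.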
-```python
-# select the identifier and cost columns needed for the merge
-cost_factor_ids = cost_factor[['issn', 'issnl', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids_1 = cost_factor_ids_1.rename(columns = {'issn_link' : 'issn'})
-# cost_factor_ids_2 = cost_factor.loc[cost_factor['issn_electronic'].notna()][['issn_electronic', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids_2 = cost_factor_ids_2.rename(columns = {'issn_electronic' : 'issn'})
-# cost_factor_ids_3 = cost_factor.loc[cost_factor['issn_print'].notna()][['issn_print', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids_3 = cost_factor_ids_3.rename(columns = {'issn_print' : 'issn'})
-# cost_factor_ids_4 = cost_factor.loc[cost_factor['issn'].notna()][['issn', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids = cost_factor_ids_1.append(cost_factor_ids_2)
-# cost_factor_ids = cost_factor_ids.append(cost_factor_ids_3)
-# cost_factor_ids = cost_factor_ids.append(cost_factor_ids_4)
-cost_factor_ids
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment |
-|---|---|---|---|---|---|---|
-| 0 | 1651-2057 | 0001-5555 | 1 | 1600 | EUR | Source: DOAJ |
-| 1 | 2353-074X | 0001-625X | 1 | 400 | EUR | Source: DOAJ |
-| 2 | 1873-6297 | 0001-6918 | 1 | 1500 | USD | Source: DOAJ |
-| 3 | 2083-9480 | 0001-6977 | 1 | 520 | EUR | Source: DOAJ |
-| 4 | 2327-9788 | 0003-1062 | 1 | 3500 | USD | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... | ... |
-| 45502 | 8756-7938 | 1520-6033 | 1 | 2400 | EUR | Source: OpenAPC (2013) |
-| 45503 | 8756-7938 | 1520-6033 | 1 | 1822.49 | EUR | Source: OpenAPC (2014) |
-| 45504 | 8756-7938 | 1520-6033 | 1 | 1762.69 | EUR | Source: OpenAPC (2016) |
-| 45505 | 8756-7938 | 1520-6033 | 1 | 3248.31 | EUR | Source: OpenAPC (2017) |
-| 45506 | 8756-7938 | 1520-6033 | 1 | 2913.11 | EUR | Source: OpenAPC (2019) |
-
-
-
-
45507 rows × 6 columns
-
-
-
-
-
-```python
-# remove duplicate issnl values (the first row per issnl is kept)
-cost_factor_ids = cost_factor_ids.drop_duplicates(subset=['issnl'])
-cost_factor_ids
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment |
-|---|---|---|---|---|---|---|
-| 0 | 1651-2057 | 0001-5555 | 1 | 1600 | EUR | Source: DOAJ |
-| 1 | 2353-074X | 0001-625X | 1 | 400 | EUR | Source: DOAJ |
-| 2 | 1873-6297 | 0001-6918 | 1 | 1500 | USD | Source: DOAJ |
-| 3 | 2083-9480 | 0001-6977 | 1 | 520 | EUR | Source: DOAJ |
-| 4 | 2327-9788 | 0003-1062 | 1 | 3500 | USD | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... | ... |
-| 45473 | 2690-3202 | 2690-3202 | 1 | 754.67 | EUR | Source: OpenAPC (2019) |
-| 45474 | 2691-9478 | 2691-9478 | 1 | 1690 | EUR | Source: OpenAPC (2020) |
-| 45477 | 8750-7587 | 1522-1601 | 1 | 2355.13 | EUR | Source: OpenAPC (2016) |
-| 45481 | 8755-1209 | 1944-9208 | 1 | 2627.74 | EUR | Source: OpenAPC (2013) |
-| 45498 | 8756-758X | 1460-2695 | 1 | 2725.08 | EUR | Source: OpenAPC (2014) |
-
-
-
-
24018 rows × 6 columns
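
Note that `drop_duplicates(subset=['issnl'])` keeps the first row it meets for each ISSN-L, so which APC amount survives depends on row order. The sketch below illustrates this with three OpenAPC amounts for a single ISSN-L; the toy frame `rows` is an assumption for demonstration, and `keep="last"` is shown only for comparison, not as the notebook's choice.

```python
import pandas as pd

# Several yearly APC records for the same journal (illustrative subset)
rows = pd.DataFrame({
    "issnl": ["1520-6033", "1520-6033", "1520-6033"],
    "amount": [2400.0, 1822.49, 1762.69],
    "comment": ["Source: OpenAPC (2013)", "Source: OpenAPC (2014)", "Source: OpenAPC (2016)"],
})

# Default keep='first': the earliest row in the frame wins
first = rows.drop_duplicates(subset=["issnl"])

# keep='last' would retain the final row instead
last = rows.drop_duplicates(subset=["issnl"], keep="last")

print(first["amount"].iloc[0], last["amount"].iloc[0])
```

If a specific record should win (for instance the most recent year), the frame would need to be sorted accordingly before deduplicating.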
-
-
-
-
-
-```python
-# merge in the other direction to keep only the rows present in the sherpa file
-cost_factor_ids = pd.merge(cost_factor_ids, sherpa[['id', 'issnl']], on='issnl', how='left')
-cost_factor_ids
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment | id |
-|---|---|---|---|---|---|---|---|
-| 0 | 1651-2057 | 0001-5555 | 1 | 1600 | EUR | Source: DOAJ | NaN |
-| 1 | 2353-074X | 0001-625X | 1 | 400 | EUR | Source: DOAJ | NaN |
-| 2 | 1873-6297 | 0001-6918 | 1 | 1500 | USD | Source: DOAJ | NaN |
-| 3 | 2083-9480 | 0001-6977 | 1 | 520 | EUR | Source: DOAJ | NaN |
-| 4 | 2327-9788 | 0003-1062 | 1 | 3500 | USD | Source: DOAJ | NaN |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 31397 | 2690-3202 | 2690-3202 | 1 | 754.67 | EUR | Source: OpenAPC (2019) | NaN |
-| 31398 | 2691-9478 | 2691-9478 | 1 | 1690 | EUR | Source: OpenAPC (2020) | NaN |
-| 31399 | 8750-7587 | 1522-1601 | 1 | 2355.13 | EUR | Source: OpenAPC (2016) | NaN |
-| 31400 | 8755-1209 | 1944-9208 | 1 | 2627.74 | EUR | Source: OpenAPC (2013) | NaN |
-| 31401 | 8756-758X | 1460-2695 | 1 | 2725.08 | EUR | Source: OpenAPC (2014) | NaN |
-
-
-
-
31402 rows × 7 columns
-
-
-
-
-
-```python
-# keep only the rows that found a match in the merge
-cost_factor_ids_all = cost_factor_ids.loc[cost_factor_ids['id'].notnull()]
-cost_factor_ids_all
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment | id |
-|---|---|---|---|---|---|---|---|
-| 23 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1369.0 |
-| 24 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1370.0 |
-| 25 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1371.0 |
-| 26 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1372.0 |
-| 31 | 1536-5964 | 0025-7974 | 1 | 1950 | USD | Source: DOAJ | 2147.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 31297 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8591.0 |
-| 31298 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8592.0 |
-| 31299 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8593.0 |
-| 31300 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8594.0 |
-| 31301 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8595.0 |
-
-
-
-
7964 rows × 7 columns
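
The two steps above (left merge, then keeping rows where `id` is not null) select the same rows as an inner merge. The sketch below shows this on toy data; `costs` and `sherpa_ids` are hypothetical stand-ins. It also suggests why the row count grew from 24018 to 31402 after the left merge: one ISSN-L can match several `sherpa` policy rows.

```python
import pandas as pd

# One journal with four sherpa policy rows, one journal absent from sherpa
costs = pd.DataFrame({"issnl": ["0021-9258", "0001-5555"], "amount": [2500, 1600]})
sherpa_ids = pd.DataFrame({"id": [1369, 1370, 1371, 1372], "issnl": ["0021-9258"] * 4})

# Left merge: every cost row is kept, and matched rows are repeated per sherpa id
left = pd.merge(costs, sherpa_ids, on="issnl", how="left")
kept = left.loc[left["id"].notnull()]

# An inner merge returns the matched rows directly
inner = pd.merge(costs, sherpa_ids, on="issnl", how="inner")

print(len(left), len(kept), len(inner))
```

With the left merge, `id` becomes a float column because of the NaN values, which matches the `1369.0` style ids in the output above.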
-
-
-
-
-
-```python
-# remove duplicate sherpa ids
-cost_factor_ids_all = cost_factor_ids_all.drop_duplicates(subset=['id'])
-cost_factor_ids_all
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment | id |
-|---|---|---|---|---|---|---|---|
-| 23 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1369.0 |
-| 24 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1370.0 |
-| 25 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1371.0 |
-| 26 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1372.0 |
-| 31 | 1536-5964 | 0025-7974 | 1 | 1950 | USD | Source: DOAJ | 2147.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 31297 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8591.0 |
-| 31298 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8592.0 |
-| 31299 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8593.0 |
-| 31300 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8594.0 |
-| 31301 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 8595.0 |
-
-
-
-
7964 rows × 7 columns
-
-
-
-
-
-```python
-# remove duplicates by issnl
-cost_factor_ids_all = cost_factor_ids_all.drop_duplicates(subset=['issnl'])
-del cost_factor_ids_all['id']
-cost_factor_ids_all
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment |
-|---|---|---|---|---|---|---|
-| 23 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ |
-| 31 | 1536-5964 | 0025-7974 | 1 | 1950 | USD | Source: DOAJ |
-| 222 | 1592-8721 | 0390-6078 | 1 | 2000 | EUR | Source: DOAJ |
-| 303 | 1555-3892 | 0963-6897 | 1 | 2750 | USD | Source: DOAJ |
-| 402 | 1095-9572 | 1053-8119 | 1 | 3000 | USD | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... | ... |
-| 31237 | 2469-9926 | 2469-9926 | 1 | 2156.51 | EUR | Source: OpenAPC (2015) |
-| 31242 | 2469-9950 | 2469-9950 | 1 | 2143.51 | EUR | Source: OpenAPC (2016) |
-| 31248 | 2470-0010 | 2470-0010 | 1 | 1763.13 | EUR | Source: OpenAPC (2016) |
-| 31253 | 2470-0045 | 2470-0045 | 1 | 1211.45 | EUR | Source: OpenAPC (2016) |
-| 31297 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) |
-
-
-
-
580 rows × 6 columns
-
-
-
-
-
-```python
-# renumber the rows 0..n-1, discarding the old row labels
-cost_factor_ids_all = cost_factor_ids_all.reset_index(drop=True)
-# assign the cost factor id as row position + id_start
-cost_factor_ids_all['cost_factor'] = cost_factor_ids_all.index + id_start
-cost_factor_ids_all
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment | cost_factor |
-|---|---|---|---|---|---|---|---|
-| 0 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1 |
-| 1 | 1536-5964 | 0025-7974 | 1 | 1950 | USD | Source: DOAJ | 2 |
-| 2 | 1592-8721 | 0390-6078 | 1 | 2000 | EUR | Source: DOAJ | 3 |
-| 3 | 1555-3892 | 0963-6897 | 1 | 2750 | USD | Source: DOAJ | 4 |
-| 4 | 1095-9572 | 1053-8119 | 1 | 3000 | USD | Source: DOAJ | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 575 | 2469-9926 | 2469-9926 | 1 | 2156.51 | EUR | Source: OpenAPC (2015) | 576 |
-| 576 | 2469-9950 | 2469-9950 | 1 | 2143.51 | EUR | Source: OpenAPC (2016) | 577 |
-| 577 | 2470-0010 | 2470-0010 | 1 | 1763.13 | EUR | Source: OpenAPC (2016) | 578 |
-| 578 | 2470-0045 | 2470-0045 | 1 | 1211.45 | EUR | Source: OpenAPC (2016) | 579 |
-| 579 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 580 |
-
-
-
-
580 rows × 7 columns
-
-
-
-
-
-```python
-# merge with the sherpa table
-sherpa = pd.merge(sherpa, cost_factor_ids_all[['issnl', 'cost_factor']], on='issnl', how='left')
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
- cost_factor
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
- NaN
- 355.0
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
- NaN
- 355.0
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- 1.0
- 355.0
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- 2.0
- 355.0
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
- NaN
- 356.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
- NaN
- 580.0
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
- NaN
- 580.0
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- NaN
- 580.0
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- 1.0
- 580.0
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- 1.0
- 580.0
-
-
-
-
8595 rows × 33 columns
-
-
-
-
-
-```python
-sherpa.loc[sherpa['cost_factor'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
- cost_factor
-
-
-
-
- 93
- 787
- 0002-9513
- 7391
- https://v2.sherpa.ac.uk/id/publisher_policy/11
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository
- NaN
- arXiv ; bioRxiv ; Preprint Repository
- arXiv ; bioRxiv
- NaN
- authors
- NaN
- False
- Must be assigned a DOI
- Can not be deposited after submission to journal
- 94
- 0002-9513
- 1
- NaN
- NaN
-
-
- 94
- 787
- 0002-9513
- 7391
- https://v2.sherpa.ac.uk/id/publisher_policy/11
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- institutional_repository
- Institutional Repository
- NaN
- NaN
- NaN
- publishers
- NaN
- True
- Must link to publisher version with DOI
- NaN
- 95
- 0002-9513
- 2
- NaN
- NaN
-
-
- 95
- 787
- 0002-9513
- 7391
- https://v2.sherpa.ac.uk/id/publisher_policy/11
- no
- no
- published
- NaN
- 12
- NaN
- True
- National Institutes of Health (NIH)
- http://dx.doi.org/10.13039/100000002
- https://ror.org/01cwqze88
- us
- http://www.nih.gov/
- 9.0
- NaN
- named_repository
- NaN
- PubMed Central
- PubMed Central
- NaN
- publishers
- disciplinary (PubMed Central) ;
- False
- Must link to publisher version with DOI
- NaN
- 96
- 0002-9513
- 3
- NaN
- NaN
-
-
- 96
- 787
- 0002-9513
- 7391
- https://v2.sherpa.ac.uk/id/publisher_policy/11
- no
- no
- published
- NaN
- 12
- NaN
- True
- Wellcome Trust
- http://dx.doi.org/10.13039/100004440
- https://ror.org/029chgv08
- gb
- http://www.wellcome.ac.uk/
- 695.0
- NaN
- named_repository
- NaN
- PubMed Central
- PubMed Central
- NaN
- publishers
- disciplinary (PubMed Central) ;
- False
- Must link to publisher version with DOI
- NaN
- 97
- 0002-9513
- 3
- NaN
- NaN
-
-
- 97
- 787
- 0002-9513
- 7391
- https://v2.sherpa.ac.uk/id/publisher_policy/11
- no
- no
- published
- NaN
- 12
- NaN
- True
- Medical Research Council (MRC)
- http://dx.doi.org/10.13039/501100000265
- https://ror.org/03x94j517
- gb
- http://www.mrc.ac.uk/index.htm
- 705.0
- NaN
- named_repository
- NaN
- PubMed Central
- PubMed Central
- NaN
- publishers
- disciplinary (PubMed Central) ;
- False
- Must link to publisher version with DOI
- NaN
- 98
- 0002-9513
- 3
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8199
- 565
- 1661-8157
- 8459
- https://v2.sherpa.ac.uk/id/publisher_policy/3494
- no
- yes
- published
- cc_by_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; subject_repos...
- Any Website
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- NaN
- NaN
- True
- Published source must be acknowledged with cit...
- NaN
- 8200
- 1661-8157
- 3
- 8.0
- NaN
-
-
- 8200
- 565
- 1661-8157
- 8459
- https://v2.sherpa.ac.uk/id/publisher_policy/3494
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; subject_repos...
- Any Website
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- NaN
- NaN
- True
- Published source must be acknowledged with cit...
- NaN
- 8201
- 1661-8157
- 3
- 2.0
- NaN
-
-
- 8373
- 530
- 1946-6234
- 11116
- https://v2.sherpa.ac.uk/id/publisher_policy/3
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository
- NaN
- arXiv ; bioRxiv ; Preprint Repository
- arXiv ; bioRxiv
- NaN
- NaN
- NaN
- False
- May be considered prior publication, contact j...
- NaN
- 8374
- 1946-6234
- 1
- NaN
- NaN
-
-
- 8374
- 530
- 1946-6234
- 11116
- https://v2.sherpa.ac.uk/id/publisher_policy/3
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository
- Institutional Repository
- Author's Homepage
- NaN
- NaN
- authors
- NaN
- True
- Published source must be acknowledged with DOI...
- NaN
- 8375
- 1946-6234
- 2
- NaN
- NaN
-
-
- 8375
- 530
- 1946-6234
- 11116
- https://v2.sherpa.ac.uk/id/publisher_policy/3
- no
- no
- accepted
- NaN
- 6
- when_required_by_funder
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- funder_designated_location ; named_repository
- NaN
- Funder Designated Location ; PubMed Central
- PubMed Central
- NaN
- authors
- NaN
- False
- Must state on submission Funding agency requir...
- NaN
- 8376
- 1946-6234
- 2
- NaN
- NaN
-
-
-
-
631 rows × 33 columns
-
-
-
-
-
-```python
-# keep the APCs only for the published version
-sherpa.loc[sherpa['article_version'] != 'published', 'cost_factor'] = np.nan
-sherpa.loc[sherpa['cost_factor'].notna()]
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
- cost_factor
-
-
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- 1.0
- 355.0
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- 2.0
- 355.0
-
-
- 6
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- funder_designated_location ; named_repository ...
- NaN
- Funder Designated Location ; PubMed Central ; ...
- PubMed Central
- NaN
- publishers
- disciplinary (PubMed Central) ;
- False
- NaN
- NaN
- 7
- 0001-4842
- 3
- 1.0
- 356.0
-
-
- 7
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- funder_designated_location ; named_repository ...
- NaN
- Funder Designated Location ; PubMed Central ; ...
- PubMed Central
- NaN
- publishers
- disciplinary (PubMed Central) ;
- False
- NaN
- NaN
- 8
- 0001-4842
- 3
- 2.0
- 356.0
-
-
- 8
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- yes
- published
- bespoke_license
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- funder_designated_location ; named_repository ...
- NaN
- Funder Designated Location ; PubMed Central ; ...
- PubMed Central
- NaN
- publishers
- disciplinary (PubMed Central) ;
- False
- NaN
- NaN
- 9
- 0001-4842
- 3
- 3.0
- 356.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8588
- 533
- 2470-0045
- 31531
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8589
- 2470-0045
- 3
- 1.0
- 579.0
-
-
- 8589
- 533
- 2470-0045
- 31531
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8590
- 2470-0045
- 3
- 1.0
- 579.0
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- NaN
- 580.0
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- 1.0
- 580.0
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- 1.0
- 580.0
-
-
-
-
4462 rows × 33 columns
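
As a hedged alternative to the boolean `.loc` assignment above, the same masking can be written with `Series.where`, which keeps values where the condition holds and sets the rest to NaN. The toy frame `policies` is an assumption for illustration.

```python
import pandas as pd

# One journal with three policy rows sharing the same cost factor id
policies = pd.DataFrame({
    "article_version": ["submitted", "accepted", "published"],
    "cost_factor": [355.0, 355.0, 355.0],
})

# Keep the cost factor only on rows describing the published version
is_published = policies["article_version"] == "published"
policies["cost_factor"] = policies["cost_factor"].where(is_published)

print(policies["cost_factor"].isna().tolist())
```

Both forms leave the APC reference attached to the published version only, since an APC does not apply to submitted or accepted manuscripts.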
-
-
-
-
-
-```python
-# rename the id of the raw sherpa file
-# cost_factor_ids_all = cost_factor_ids_all.rename(columns = {'id' : 'id_sherpa'})
-cost_factor_ids_all = cost_factor_ids_all.rename(columns = {'cost_factor' : 'id'})
-cost_factor_ids_all
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment | id |
-|---|---|---|---|---|---|---|---|
-| 0 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1 |
-| 1 | 1536-5964 | 0025-7974 | 1 | 1950 | USD | Source: DOAJ | 2 |
-| 2 | 1592-8721 | 0390-6078 | 1 | 2000 | EUR | Source: DOAJ | 3 |
-| 3 | 1555-3892 | 0963-6897 | 1 | 2750 | USD | Source: DOAJ | 4 |
-| 4 | 1095-9572 | 1053-8119 | 1 | 3000 | USD | Source: DOAJ | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 575 | 2469-9926 | 2469-9926 | 1 | 2156.51 | EUR | Source: OpenAPC (2015) | 576 |
-| 576 | 2469-9950 | 2469-9950 | 1 | 2143.51 | EUR | Source: OpenAPC (2016) | 577 |
-| 577 | 2470-0010 | 2470-0010 | 1 | 1763.13 | EUR | Source: OpenAPC (2016) | 578 |
-| 578 | 2470-0045 | 2470-0045 | 1 | 1211.45 | EUR | Source: OpenAPC (2016) | 579 |
-| 579 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 580 |
-
-
-
-
580 rows × 7 columns
-
-
-
-
-
-```python
-cost_factor_ids_all['id'] = cost_factor_ids_all['id'].astype(int)
-```
-
-
-```python
-cost_factor_ids_all
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | issnl | cost_factor_type | amount | symbol | comment | id |
-|---|---|---|---|---|---|---|---|
-| 0 | 1083-351X | 0021-9258 | 1 | 2500 | USD | Source: DOAJ | 1 |
-| 1 | 1536-5964 | 0025-7974 | 1 | 1950 | USD | Source: DOAJ | 2 |
-| 2 | 1592-8721 | 0390-6078 | 1 | 2000 | EUR | Source: DOAJ | 3 |
-| 3 | 1555-3892 | 0963-6897 | 1 | 2750 | USD | Source: DOAJ | 4 |
-| 4 | 1095-9572 | 1053-8119 | 1 | 3000 | USD | Source: DOAJ | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 575 | 2469-9926 | 2469-9926 | 1 | 2156.51 | EUR | Source: OpenAPC (2015) | 576 |
-| 576 | 2469-9950 | 2469-9950 | 1 | 2143.51 | EUR | Source: OpenAPC (2016) | 577 |
-| 577 | 2470-0010 | 2470-0010 | 1 | 1763.13 | EUR | Source: OpenAPC (2016) | 578 |
-| 578 | 2470-0045 | 2470-0045 | 1 | 1211.45 | EUR | Source: OpenAPC (2016) | 579 |
-| 579 | 2475-9953 | 2475-9953 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) | 580 |
-
-
-
-
580 rows × 7 columns
-
-
-
-
-
-```python
-cost_factor_export = cost_factor_ids_all[['id', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-cost_factor_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | cost_factor_type | amount | symbol | comment |
-|---|---|---|---|---|---|
-| 0 | 1 | 1 | 2500 | USD | Source: DOAJ |
-| 1 | 2 | 1 | 1950 | USD | Source: DOAJ |
-| 2 | 3 | 1 | 2000 | EUR | Source: DOAJ |
-| 3 | 4 | 1 | 2750 | USD | Source: DOAJ |
-| 4 | 5 | 1 | 3000 | USD | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... |
-| 575 | 576 | 1 | 2156.51 | EUR | Source: OpenAPC (2015) |
-| 576 | 577 | 1 | 2143.51 | EUR | Source: OpenAPC (2016) |
-| 577 | 578 | 1 | 1763.13 | EUR | Source: OpenAPC (2016) |
-| 578 | 579 | 1 | 1211.45 | EUR | Source: OpenAPC (2016) |
-| 579 | 580 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) |
-
-
-
-
580 rows × 5 columns
-
-
-
-
-
-```python
-cost_factor_export.shape[0]
-```
-
-
-
-
- 580
-
-
-
-
-```python
-# add the 100% discount value for Read & Publish licences
-rpid = cost_factor_export.shape[0] + 1
-# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
-rp_row = pd.DataFrame([{'id' : rpid, 'cost_factor_type' : 2, 'amount' : 100, 'symbol' : '%', 'comment' : 'Read & Publish agreement'}])
-cost_factor_export = pd.concat([cost_factor_export, rp_row], ignore_index=True)
-cost_factor_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id | cost_factor_type | amount | symbol | comment |
-|---|---|---|---|---|---|
-| 0 | 1 | 1 | 2500 | USD | Source: DOAJ |
-| 1 | 2 | 1 | 1950 | USD | Source: DOAJ |
-| 2 | 3 | 1 | 2000 | EUR | Source: DOAJ |
-| 3 | 4 | 1 | 2750 | USD | Source: DOAJ |
-| 4 | 5 | 1 | 3000 | USD | Source: DOAJ |
-| ... | ... | ... | ... | ... | ... |
-| 576 | 577 | 1 | 2143.51 | EUR | Source: OpenAPC (2016) |
-| 577 | 578 | 1 | 1763.13 | EUR | Source: OpenAPC (2016) |
-| 578 | 579 | 1 | 1211.45 | EUR | Source: OpenAPC (2016) |
-| 579 | 580 | 1 | 2023.37 | EUR | Source: OpenAPC (2017) |
-| 580 | 581 | 2 | 100 | % | Read & Publish agreement |
-
-
-
-
581 rows × 5 columns
-
-
-
-
-
-```python
-# add the cost factor id to the Read & Publish table
-rp['cost_factor'] = rpid
-rp
-```
-
-
-
-
-
-
-
-
-
-
-| | issn | title | archiving | article_version | embargo_months | sherpa_code | valid_from | valid_until | issnl | ror | journal | rp_id | rp_publisher | version | licence | cost_factor |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/04d8ztx87 | 899.0 | 1 | Elsevier | 3 | 1 | 581 |
-| 1 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02bnkt322 | 899.0 | 2 | Elsevier | 3 | 1 | 581 |
-| 2 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/00zg4za48 | 899.0 | 3 | Elsevier | 3 | 1 | 581 |
-| 3 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/02s376052 | 899.0 | 4 | Elsevier | 3 | 1 | 581 |
-| 4 | 1742-7061 | Acta Biomaterialia | True | published | 0 | cc_by | 2020-01-01 | 2023-12-31 | 1742-7061 | https://ror.org/05a28rw58 | 899.0 | 5 | Elsevier | 3 | 1 | 581 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 40078 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/01swzsf04 | 592.0 | 40079 | CUP | 3 | 5 | 581 |
-| 40079 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/019whta54 | 592.0 | 40080 | CUP | 3 | 5 | 581 |
-| 40080 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/00vasag41 | 592.0 | 40081 | CUP | 3 | 5 | 581 |
-| 40081 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05r0ap620 | 592.0 | 40082 | CUP | 3 | 5 | 581 |
-| 40082 | 1435-8115 | Microscopy and Microanalysis | True | published | 60 | cc_by_nc_sa | 2021-01-01 | 2023-12-31 | 1431-9276 | https://ror.org/05pmsvm27 | 592.0 | 40083 | CUP | 3 | 5 | 581 |
-
-
-
-
40083 rows × 16 columns
-
-
-
-
-
-```python
-# add the UNKNOWN value
-cost_factor_export = cost_factor_export.append({'id' : 999999, 'cost_factor_type' : 999999, 'amount' : 0, 'symbol' : '', 'comment' : 'UNKNOWN'}, ignore_index=True)
-cost_factor_export
-```
-
-
-
-
-
-
-
-
-
-
-|     | id     | cost_factor_type | amount  | symbol | comment                  |
-|-----|--------|------------------|---------|--------|--------------------------|
-| 0   | 1      | 1                | 2500    | USD    | Source: DOAJ             |
-| 1   | 2      | 1                | 1950    | USD    | Source: DOAJ             |
-| 2   | 3      | 1                | 2000    | EUR    | Source: DOAJ             |
-| 3   | 4      | 1                | 2750    | USD    | Source: DOAJ             |
-| 4   | 5      | 1                | 3000    | USD    | Source: DOAJ             |
-| ... | ...    | ...              | ...     | ...    | ...                      |
-| 577 | 578    | 1                | 1763.13 | EUR    | Source: OpenAPC (2016)   |
-| 578 | 579    | 1                | 1211.45 | EUR    | Source: OpenAPC (2016)   |
-| 579 | 580    | 1                | 2023.37 | EUR    | Source: OpenAPC (2017)   |
-| 580 | 581    | 2                | 100     | %      | Read & Publish agreement |
-| 581 | 999999 | 999999           | 0       |        | UNKNOWN                  |
-
-
-
-
582 rows × 5 columns
-
-
-
-
-
-```python
-# export the table
-result = cost_factor_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/cost_factor.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export tsv (tab-separated, to match the .tsv file extension)
-cost_factor_export.to_csv('sample/cost_factor.tsv', index=False, sep='\t')
-```
-
-
-```python
-# export excel
-cost_factor_export.to_excel('sample/cost_factor.xlsx', index=False)
-```
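
pandas 1.0 and later can pretty-print JSON directly through the `indent` parameter of `to_json`, which avoids the round trip through `json.loads` and `json.dump`. The following is a sketch only: the one-row stand-in table and the temporary output path are assumptions, not the notebook's data or paths.

```python
import json
import os
import tempfile

import pandas as pd

# One-row stand-in for cost_factor_export
df = pd.DataFrame([
    {'id': 581, 'cost_factor_type': 2, 'amount': 100, 'symbol': '%',
     'comment': 'Read & Publish agreement'},
])

# Temporary location used for this example only
out_path = os.path.join(tempfile.mkdtemp(), 'cost_factor.json')

# orient='records' writes a list of objects; force_ascii=False keeps
# non-ASCII characters as they are instead of escaping them
df.to_json(out_path, orient='records', force_ascii=False, indent=2)

# Read the file back to check that it is valid JSON
with open(out_path, encoding='utf-8') as file:
    reloaded = json.load(file)
```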
-
-## Table term
-
-
-```python
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
- cost_factor
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
- NaN
- NaN
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
- NaN
- NaN
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- 1.0
- 355.0
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- 2.0
- 355.0
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
- NaN
- NaN
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
- NaN
- NaN
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- NaN
- 580.0
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- 1.0
- 580.0
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- 1.0
- 580.0
-
-
-
-
8595 rows × 33 columns
-
-
-
-
-
-```python
-# col_names = ['id', 'applicable_version', 'cost_factor', 'embargo', 'archiving']
-term_sherpa = sherpa[['id', 'version', 'cost_factor', 'embargo', 'archiving', 'locations_ir', 'locations_not_ir', 'licence', 'journal', 'conditions', 'public_notes', 'prerequisite_funders', 'prerequisite_funders_ror']]
-term_sherpa
-```
-
-
-
-
-
-
-
-
-
-
- id
- version
- cost_factor
- embargo
- archiving
- locations_ir
- locations_not_ir
- licence
- journal
- conditions
- public_notes
- prerequisite_funders
- prerequisite_funders_ror
-
-
-
-
- 0
- 1
- 1
- NaN
- 0
- True
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- NaN
- 532
- Must acknowledge acceptance for publication ; ...
- NaN
- NaN
- NaN
-
-
- 1
- 2
- 2
- NaN
- 12
- True
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- NaN
- 532
- Publisher source must be acknowledged with cit...
- NaN
- NaN
- NaN
-
-
- 2
- 3
- 3
- 355.0
- 0
- True
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- 1.0
- 532
- Published source must be acknowledged
- NaN
- NaN
- NaN
-
-
- 3
- 4
- 3
- 355.0
- 0
- True
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- 2.0
- 532
- Published source must be acknowledged
- NaN
- NaN
- NaN
-
-
- 4
- 5
- 1
- NaN
- 0
- False
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- NaN
- 498
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 8591
- 1
- NaN
- 0
- True
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- 608
- Must link to published article ; Publisher cop...
- NaN
- NaN
- NaN
-
-
- 8591
- 8592
- 2
- NaN
- 0
- True
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- 608
- Must link to published article ; Publisher cop...
- NaN
- NaN
- NaN
-
-
- 8592
- 8593
- 3
- 580.0
- 0
- True
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- 608
- Must link to published article ; Publisher cop...
- NaN
- NaN
- NaN
-
-
- 8593
- 8594
- 3
- 580.0
- 0
- True
- Any Repository
- Journal Website
- 1.0
- 608
- NaN
- NaN
- NaN
- NaN
-
-
- 8594
- 8595
- 3
- 580.0
- 0
- True
- Any Repository
- Journal Website
- 1.0
- 608
- NaN
- NaN
- NaN
- NaN
-
-
-
-
8595 rows × 13 columns
-
-
-
-
-
-```python
-# rename the fields
-term_sherpa = term_sherpa.rename(columns = {'id' : 'id_sherpa', 'embargo' : 'embargo_months', 'prerequisite_funders_ror' : 'ror'})
-term_sherpa
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- locations_ir
- locations_not_ir
- licence
- journal
- conditions
- public_notes
- prerequisite_funders
- ror
-
-
-
-
- 0
- 1
- 1
- NaN
- 0
- True
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- NaN
- 532
- Must acknowledge acceptance for publication ; ...
- NaN
- NaN
- NaN
-
-
- 1
- 2
- 2
- NaN
- 12
- True
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- NaN
- 532
- Publisher source must be acknowledged with cit...
- NaN
- NaN
- NaN
-
-
- 2
- 3
- 3
- 355.0
- 0
- True
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- 1.0
- 532
- Published source must be acknowledged
- NaN
- NaN
- NaN
-
-
- 3
- 4
- 3
- 355.0
- 0
- True
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- 2.0
- 532
- Published source must be acknowledged
- NaN
- NaN
- NaN
-
-
- 4
- 5
- 1
- NaN
- 0
- False
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- NaN
- 498
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 8591
- 1
- NaN
- 0
- True
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- 608
- Must link to published article ; Publisher cop...
- NaN
- NaN
- NaN
-
-
- 8591
- 8592
- 2
- NaN
- 0
- True
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- 608
- Must link to published article ; Publisher cop...
- NaN
- NaN
- NaN
-
-
- 8592
- 8593
- 3
- 580.0
- 0
- True
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- 608
- Must link to published article ; Publisher cop...
- NaN
- NaN
- NaN
-
-
- 8593
- 8594
- 3
- 580.0
- 0
- True
- Any Repository
- Journal Website
- 1.0
- 608
- NaN
- NaN
- NaN
- NaN
-
-
- 8594
- 8595
- 3
- 580.0
- 0
- True
- Any Repository
- Journal Website
- 1.0
- 608
- NaN
- NaN
- NaN
- NaN
-
-
-
-
8595 rows × 13 columns
-
-
-
-
-
-```python
-# merge the conditions, public_notes and archiving location fields into a single comment
-term_sherpa['conditions'] = term_sherpa['conditions'].fillna('')
-term_sherpa['public_notes'] = term_sherpa['public_notes'].fillna('')
-term_sherpa['locations_not_ir'] = term_sherpa['locations_not_ir'].fillna('')
-term_sherpa['locations_ir'] = term_sherpa['locations_ir'].fillna('')
-term_sherpa.loc[term_sherpa['locations_not_ir'] != '', 'locations_not_ir'] = 'Non institutional archiving locations: ' + term_sherpa['locations_not_ir']
-term_sherpa.loc[term_sherpa['locations_ir'] != '', 'locations_ir'] = 'Institutional archiving locations: ' + term_sherpa['locations_ir']
-term_sherpa.loc[term_sherpa['archiving'] == False, 'comment'] = term_sherpa['locations_not_ir']
-term_sherpa.loc[term_sherpa['archiving'] == True, 'comment'] = term_sherpa['locations_ir']
-# append the conditions to non-empty comments first, so that rows with an empty comment do not receive them twice
-term_sherpa.loc[(term_sherpa['comment'] != '') & (term_sherpa['conditions'] != ''), 'comment'] = term_sherpa['comment'] + ' ; Conditions: ' + term_sherpa['conditions']
-term_sherpa.loc[(term_sherpa['comment'] == '') & (term_sherpa['conditions'] != ''), 'comment'] = 'Conditions: ' + term_sherpa['conditions']
-# same ordering for the public notes
-term_sherpa.loc[(term_sherpa['public_notes'] != '') & (term_sherpa['comment'] != '') & (term_sherpa['public_notes'] != term_sherpa['comment']), 'comment'] = term_sherpa['comment'] + ' ; Public notes: ' + term_sherpa['public_notes']
-term_sherpa.loc[(term_sherpa['public_notes'] != '') & (term_sherpa['comment'] == ''), 'comment'] = 'Public notes: ' + term_sherpa['public_notes']
-term_sherpa
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- locations_ir
- locations_not_ir
- licence
- journal
- conditions
- public_notes
- prerequisite_funders
- ror
- comment
-
-
-
-
- 0
- 1
- 1
- NaN
- 0
- True
- Institutional archiving locations: Non-Commerc...
- Non institutional archiving locations: Author'...
- NaN
- 532
- Must acknowledge acceptance for publication ; ...
-
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
-
-
- 1
- 2
- 2
- NaN
- 12
- True
- Institutional archiving locations: Non-Commerc...
- Non institutional archiving locations: Author'...
- NaN
- 532
- Publisher source must be acknowledged with cit...
-
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
-
-
- 2
- 3
- 3
- 355.0
- 0
- True
- Institutional archiving locations: Any Website...
- Non institutional archiving locations: PubMed ...
- 1.0
- 532
- Published source must be acknowledged
-
- NaN
- NaN
- Institutional archiving locations: Any Website...
-
-
- 3
- 4
- 3
- 355.0
- 0
- True
- Institutional archiving locations: Any Website...
- Non institutional archiving locations: PubMed ...
- 2.0
- 532
- Published source must be acknowledged
-
- NaN
- NaN
- Institutional archiving locations: Any Website...
-
-
- 4
- 5
- 1
- NaN
- 0
- False
-
- Non institutional archiving locations: ChemRxi...
- NaN
- 498
- Must not violate ACS ethical Guidelines ; Must...
-
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 8591
- 1
- NaN
- 0
- True
- Institutional archiving locations: Institution...
- Non institutional archiving locations: Author'...
- NaN
- 608
- Must link to published article ; Publisher cop...
-
- NaN
- NaN
- Institutional archiving locations: Institution...
-
-
- 8591
- 8592
- 2
- NaN
- 0
- True
- Institutional archiving locations: Institution...
- Non institutional archiving locations: Author'...
- NaN
- 608
- Must link to published article ; Publisher cop...
-
- NaN
- NaN
- Institutional archiving locations: Institution...
-
-
- 8592
- 8593
- 3
- 580.0
- 0
- True
- Institutional archiving locations: Institution...
- Non institutional archiving locations: Author'...
- NaN
- 608
- Must link to published article ; Publisher cop...
-
- NaN
- NaN
- Institutional archiving locations: Institution...
-
-
- 8593
- 8594
- 3
- 580.0
- 0
- True
- Institutional archiving locations: Any Repository
- Non institutional archiving locations: Journal...
- 1.0
- 608
-
-
- NaN
- NaN
- Institutional archiving locations: Any Repository
-
-
- 8594
- 8595
- 3
- 580.0
- 0
- True
- Institutional archiving locations: Any Repository
- Non institutional archiving locations: Journal...
- 1.0
- 608
-
-
- NaN
- NaN
- Institutional archiving locations: Any Repository
-
-
-
-
8595 rows × 14 columns
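
The merge of locations, conditions and public notes into one comment can also be written as a single row function that joins whichever parts are non-empty. This is a minimal sketch, not the notebook's method: the helper name `build_comment` and the three sample rows are assumptions, and the sketch does not reproduce the check that skips public notes identical to the comment.

```python
import pandas as pd


def build_comment(row):
    """Join the non-empty parts of a policy row into one comment string (hypothetical helper)."""
    # Pick the archiving locations that match the archiving flag
    if row['archiving']:
        locations = ('Institutional archiving locations', row['locations_ir'])
    else:
        locations = ('Non institutional archiving locations', row['locations_not_ir'])
    parts = [
        locations,
        ('Conditions', row['conditions']),
        ('Public notes', row['public_notes']),
    ]
    # Empty strings are skipped, so no dangling labels or separators appear
    return ' ; '.join(f'{label}: {text}' for label, text in parts if text)


# Sample rows (illustrative values only)
sample = pd.DataFrame([
    {'archiving': True, 'locations_ir': 'Any Repository', 'locations_not_ir': 'Journal Website',
     'conditions': None, 'public_notes': None},
    {'archiving': False, 'locations_ir': None, 'locations_not_ir': 'ChemRxiv ; bioRxiv ; arXiv',
     'conditions': 'Must link to published article', 'public_notes': None},
    {'archiving': True, 'locations_ir': None, 'locations_not_ir': None,
     'conditions': 'Published source must be acknowledged', 'public_notes': None},
]).fillna('')

sample['comment'] = sample.apply(build_comment, axis=1)
```

Building the string in one place makes it harder for a label such as `Conditions:` to be added twice to the same row.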
-
-
-
-
-
-```python
-term_sherpa['prerequisite_funders'].value_counts()
-```
-
-
-
-
- True 5585
- Name: prerequisite_funders, dtype: int64
-
-
-
-
-```python
-rp
-```
-
-
-
-
-
-
-
-
-
-
- issn
- title
- archiving
- article_version
- embargo_months
- sherpa_code
- valid_from
- valid_until
- issnl
- ror
- journal
- rp_id
- rp_publisher
- version
- licence
- cost_factor
-
-
-
-
- 0
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/04d8ztx87
- 899.0
- 1
- Elsevier
- 3
- 1
- 581
-
-
- 1
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02bnkt322
- 899.0
- 2
- Elsevier
- 3
- 1
- 581
-
-
- 2
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/00zg4za48
- 899.0
- 3
- Elsevier
- 3
- 1
- 581
-
-
- 3
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/02s376052
- 899.0
- 4
- Elsevier
- 3
- 1
- 581
-
-
- 4
- 1742-7061
- Acta Biomaterialia
- True
- published
- 0
- cc_by
- 2020-01-01
- 2023-12-31
- 1742-7061
- https://ror.org/05a28rw58
- 899.0
- 5
- Elsevier
- 3
- 1
- 581
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 40078
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/01swzsf04
- 592.0
- 40079
- CUP
- 3
- 5
- 581
-
-
- 40079
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/019whta54
- 592.0
- 40080
- CUP
- 3
- 5
- 581
-
-
- 40080
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/00vasag41
- 592.0
- 40081
- CUP
- 3
- 5
- 581
-
-
- 40081
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05r0ap620
- 592.0
- 40082
- CUP
- 3
- 5
- 581
-
-
- 40082
- 1435-8115
- Microscopy and Microanalysis
- True
- published
- 60
- cc_by_nc_sa
- 2021-01-01
- 2023-12-31
- 1431-9276
- https://ror.org/05pmsvm27
- 592.0
- 40083
- CUP
- 3
- 5
- 581
-
-
-
-
40083 rows × 16 columns
-
-
-
-
-
-```python
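-# .copy() makes term_rp an independent DataFrame, so later .loc assignments do not raise SettingWithCopyWarning
-term_rp = rp[['rp_id', 'version', 'archiving', 'embargo_months', 'cost_factor', 'licence', 'journal', 'rp_publisher', 'ror', 'valid_from', 'valid_until']].copy()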
-term_rp
-```
-
-
-
-
-
-
-
-
-
-
-|       | rp_id | version | archiving | embargo_months | cost_factor | licence | journal | rp_publisher | ror                       | valid_from | valid_until |
-|-------|-------|---------|-----------|----------------|-------------|---------|---------|--------------|---------------------------|------------|-------------|
-| 0     | 1     | 3       | True      | 0              | 581         | 1       | 899.0   | Elsevier     | https://ror.org/04d8ztx87 | 2020-01-01 | 2023-12-31  |
-| 1     | 2     | 3       | True      | 0              | 581         | 1       | 899.0   | Elsevier     | https://ror.org/02bnkt322 | 2020-01-01 | 2023-12-31  |
-| 2     | 3     | 3       | True      | 0              | 581         | 1       | 899.0   | Elsevier     | https://ror.org/00zg4za48 | 2020-01-01 | 2023-12-31  |
-| 3     | 4     | 3       | True      | 0              | 581         | 1       | 899.0   | Elsevier     | https://ror.org/02s376052 | 2020-01-01 | 2023-12-31  |
-| 4     | 5     | 3       | True      | 0              | 581         | 1       | 899.0   | Elsevier     | https://ror.org/05a28rw58 | 2020-01-01 | 2023-12-31  |
-| ...   | ...   | ...     | ...       | ...            | ...         | ...     | ...     | ...          | ...                       | ...        | ...         |
-| 40078 | 40079 | 3       | True      | 60             | 581         | 5       | 592.0   | CUP          | https://ror.org/01swzsf04 | 2021-01-01 | 2023-12-31  |
-| 40079 | 40080 | 3       | True      | 60             | 581         | 5       | 592.0   | CUP          | https://ror.org/019whta54 | 2021-01-01 | 2023-12-31  |
-| 40080 | 40081 | 3       | True      | 60             | 581         | 5       | 592.0   | CUP          | https://ror.org/00vasag41 | 2021-01-01 | 2023-12-31  |
-| 40081 | 40082 | 3       | True      | 60             | 581         | 5       | 592.0   | CUP          | https://ror.org/05r0ap620 | 2021-01-01 | 2023-12-31  |
-| 40082 | 40083 | 3       | True      | 60             | 581         | 5       | 592.0   | CUP          | https://ror.org/05pmsvm27 | 2021-01-01 | 2023-12-31  |
-
-
-
-
40083 rows × 11 columns
-
-
-
-
-
-```python
-term_rp['rp_publisher'].value_counts()
-```
-
-
-
-
- Elsevier 18128
- Wiley 13905
- Springer Nature 6716
- CUP 920
- TF 414
- Name: rp_publisher, dtype: int64
-
-
-
-
-```python
-term_rp.loc[term_rp['rp_publisher'] == 'Elsevier', 'comment'] = 'Elsevier Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'Wiley', 'comment'] = 'Wiley Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'TF', 'comment'] = 'Taylor and Francis Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'Springer Nature ', 'comment'] = 'Springer Nature Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'CUP', 'comment'] = 'Cambridge University Press (CUP) Read & Publish agreement. Article types covered: Research Articles, Review Articles, Rapid Communication, Brief Reports and Case Reports'
-del term_rp['rp_publisher']
-term_rp
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py:376: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
- self.obj[key] = _infer_fill_value(value)
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py:494: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
- self.obj[item] = s
-
-
-
-
-
-
-
-
-
-
-
-|       | rp_id | version | archiving | embargo_months | cost_factor | licence | journal | ror                       | valid_from | valid_until | comment                                           |
-|-------|-------|---------|-----------|----------------|-------------|---------|---------|---------------------------|------------|-------------|---------------------------------------------------|
-| 0     | 1     | 3       | True      | 0              | 581         | 1       | 899.0   | https://ror.org/04d8ztx87 | 2020-01-01 | 2023-12-31  | Elsevier Read & Publish agreement                 |
-| 1     | 2     | 3       | True      | 0              | 581         | 1       | 899.0   | https://ror.org/02bnkt322 | 2020-01-01 | 2023-12-31  | Elsevier Read & Publish agreement                 |
-| 2     | 3     | 3       | True      | 0              | 581         | 1       | 899.0   | https://ror.org/00zg4za48 | 2020-01-01 | 2023-12-31  | Elsevier Read & Publish agreement                 |
-| 3     | 4     | 3       | True      | 0              | 581         | 1       | 899.0   | https://ror.org/02s376052 | 2020-01-01 | 2023-12-31  | Elsevier Read & Publish agreement                 |
-| 4     | 5     | 3       | True      | 0              | 581         | 1       | 899.0   | https://ror.org/05a28rw58 | 2020-01-01 | 2023-12-31  | Elsevier Read & Publish agreement                 |
-| ...   | ...   | ...     | ...       | ...            | ...         | ...     | ...     | ...                       | ...        | ...         | ...                                               |
-| 40078 | 40079 | 3       | True      | 60             | 581         | 5       | 592.0   | https://ror.org/01swzsf04 | 2021-01-01 | 2023-12-31  | Cambridge University Press (CUP) Read & Publis... |
-| 40079 | 40080 | 3       | True      | 60             | 581         | 5       | 592.0   | https://ror.org/019whta54 | 2021-01-01 | 2023-12-31  | Cambridge University Press (CUP) Read & Publis... |
-| 40080 | 40081 | 3       | True      | 60             | 581         | 5       | 592.0   | https://ror.org/00vasag41 | 2021-01-01 | 2023-12-31  | Cambridge University Press (CUP) Read & Publis... |
-| 40081 | 40082 | 3       | True      | 60             | 581         | 5       | 592.0   | https://ror.org/05r0ap620 | 2021-01-01 | 2023-12-31  | Cambridge University Press (CUP) Read & Publis... |
-| 40082 | 40083 | 3       | True      | 60             | 581         | 5       | 592.0   | https://ror.org/05pmsvm27 | 2021-01-01 | 2023-12-31  | Cambridge University Press (CUP) Read & Publis... |
-
-
-
-
40083 rows × 11 columns
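
The per-publisher assignments can be expressed as one lookup table applied with `Series.map`. The sketch below uses the labels from the cell above on a three-row stand-in table (an assumption). It strips whitespace before the lookup because the truncated output does not show whether the `rp_publisher` values carry trailing spaces.

```python
import pandas as pd

# Labels copied from the notebook cell; keys are the rp_publisher abbreviations
PUBLISHER_COMMENTS = {
    'Elsevier': 'Elsevier Read & Publish agreement',
    'Wiley': 'Wiley Read & Publish agreement',
    'TF': 'Taylor and Francis Read & Publish agreement',
    'Springer Nature': 'Springer Nature Read & Publish agreement',
    'CUP': ('Cambridge University Press (CUP) Read & Publish agreement. '
            'Article types covered: Research Articles, Review Articles, '
            'Rapid Communication, Brief Reports and Case Reports'),
}

# Stand-in for term_rp (illustrative rows only)
term_rp = pd.DataFrame({
    'rp_id': [1, 2, 3],
    'rp_publisher': ['Elsevier', 'Springer Nature ', 'CUP'],
})

# Strip stray whitespace so 'Springer Nature ' and 'Springer Nature' both match
term_rp['comment'] = term_rp['rp_publisher'].str.strip().map(PUBLISHER_COMMENTS)
term_rp = term_rp.drop(columns='rp_publisher')
```

Publishers missing from the dictionary end up with `NaN` in `comment`, which makes unmapped values easy to spot with `term_rp['comment'].isna().sum()`.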
-
-
-
-
-
-```python
-# concatenate the two tables
-term_orig = term_sherpa[['id_sherpa', 'version', 'cost_factor', 'embargo_months', 'archiving', 'licence', 'journal', 'prerequisite_funders', 'ror', 'comment']]
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
-|      | id_sherpa | version | cost_factor | embargo_months | archiving | licence | journal | prerequisite_funders | ror | comment                                           |
-|------|-----------|---------|-------------|----------------|-----------|---------|---------|----------------------|-----|---------------------------------------------------|
-| 0    | 1         | 1       | NaN         | 0              | True      | NaN     | 532     | NaN                  | NaN | Institutional archiving locations: Non-Commerc... |
-| 1    | 2         | 2       | NaN         | 12             | True      | NaN     | 532     | NaN                  | NaN | Institutional archiving locations: Non-Commerc... |
-| 2    | 3         | 3       | 355.0       | 0              | True      | 1.0     | 532     | NaN                  | NaN | Institutional archiving locations: Any Website... |
-| 3    | 4         | 3       | 355.0       | 0              | True      | 2.0     | 532     | NaN                  | NaN | Institutional archiving locations: Any Website... |
-| 4    | 5         | 1       | NaN         | 0              | False     | NaN     | 498     | NaN                  | NaN | Non institutional archiving locations: ChemRxi... |
-| ...  | ...       | ...     | ...         | ...            | ...       | ...     | ...     | ...                  | ... | ...                                               |
-| 8590 | 8591      | 1       | NaN         | 0              | True      | NaN     | 608     | NaN                  | NaN | Institutional archiving locations: Institution... |
-| 8591 | 8592      | 2       | NaN         | 0              | True      | NaN     | 608     | NaN                  | NaN | Institutional archiving locations: Institution... |
-| 8592 | 8593      | 3       | 580.0       | 0              | True      | NaN     | 608     | NaN                  | NaN | Institutional archiving locations: Institution... |
-| 8593 | 8594      | 3       | 580.0       | 0              | True      | 1.0     | 608     | NaN                  | NaN | Institutional archiving locations: Any Repository |
-| 8594 | 8595      | 3       | 580.0       | 0              | True      | 1.0     | 608     | NaN                  | NaN | Institutional archiving locations: Any Repository |
-
-
-
-
8595 rows × 10 columns
-
-
-
-
-
-```python
-term_orig = term_orig.append(term_rp, ignore_index=True, sort=False)
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
-
-
-
-
- 0
- 1.0
- 1
- NaN
- 0
- True
- NaN
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
-
-
- 1
- 2.0
- 2
- NaN
- 12
- True
- NaN
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
-
-
- 2
- 3.0
- 3
- 355.0
- 0
- True
- 1.0
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
-
-
- 3
- 4.0
- 3
- 355.0
- 0
- True
- 2.0
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
-
-
- 4
- 5.0
- 1
- NaN
- 0
- False
- NaN
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581.0
- 60
- True
- 5.0
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
-
-
- 48674
- NaN
- 3
- 581.0
- 60
- True
- 5.0
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
-
-
- 48675
- NaN
- 3
- 581.0
- 60
- True
- 5.0
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
-
-
- 48676
- NaN
- 3
- 581.0
- 60
- True
- 5.0
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
-
-
- 48677
- NaN
- 3
- 581.0
- 60
- True
- 5.0
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
-
-
-
-
48678 rows × 13 columns
-
-
-
-
-
-```python
-# add a unique hash for each variant
-term_orig['id_content_hash'] = term_orig.apply(lambda x: hash(tuple(x[['version', 'cost_factor', 'embargo_months', 'archiving', 'comment']])), axis = 1)
-term_orig['id_content_hash_licence'] = term_orig.apply(lambda x: hash(tuple(x[['version', 'cost_factor', 'embargo_months', 'archiving', 'licence', 'comment']])), axis = 1)
-```
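
Python's built-in `hash()` salts string hashes per interpreter process (unless `PYTHONHASHSEED` is fixed), so the values computed above differ from one session to the next. If the hashes need to be reproducible, a digest from `hashlib` is one option. The following is a sketch under assumptions: the helper name `content_hash`, the `<NA>` token for missing values and the sample rows are illustrative and not part of the notebook.

```python
import hashlib

import pandas as pd

HASH_COLUMNS = ['version', 'cost_factor', 'embargo_months', 'archiving', 'comment']


def content_hash(row, columns=HASH_COLUMNS):
    """Return a hex digest of the selected fields that is stable across sessions."""
    # Missing values are mapped to a fixed token so that NaN always hashes identically
    values = ['<NA>' if pd.isna(row[col]) else str(row[col]) for col in columns]
    # The unit separator keeps adjacent fields from running into each other
    return hashlib.sha256('\x1f'.join(values).encode('utf-8')).hexdigest()


# Sample rows: the first two are the same variant, the third differs
sample = pd.DataFrame([
    {'version': 2, 'cost_factor': None, 'embargo_months': 12, 'archiving': True, 'comment': 'a'},
    {'version': 2, 'cost_factor': None, 'embargo_months': 12, 'archiving': True, 'comment': 'a'},
    {'version': 3, 'cost_factor': 581.0, 'embargo_months': 60, 'archiving': True, 'comment': 'b'},
])
sample['id_content_hash'] = sample.apply(content_hash, axis=1)
```

Because the values are compared as strings, `1` and `1.0` would produce different digests, so column dtypes should be settled before hashing.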
-
-
-```python
-term_orig.sort_values(by='id_content_hash')
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
-
-
-
-
- 6599
- 6600.0
- 2
- NaN
- 12
- True
- NaN
- 923.0
- True
- https://ror.org/056y81r79
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -9213354388875732238
- -5975042390572407328
-
-
- 6867
- 6868.0
- 2
- NaN
- 12
- True
- NaN
- 957.0
- True
- https://ror.org/056bwcz71
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -9213354388875732238
- -5975042390572407328
-
-
- 4750
- 4751.0
- 2
- NaN
- 12
- True
- NaN
- 642.0
- True
- https://ror.org/05w9mt194
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -9213354388875732238
- -5975042390572407328
-
-
- 8236
- 8237.0
- 2
- NaN
- 12
- True
- NaN
- 640.0
- True
- https://ror.org/02wxr8x18
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -9213354388875732238
- -5975042390572407328
-
-
- 8237
- 8238.0
- 2
- NaN
- 12
- True
- NaN
- 640.0
- True
- https://ror.org/056y81r79
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -9213354388875732238
- -5975042390572407328
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 6353
- 6354.0
- 3
- 222.0
- 0
- True
- 1.0
- 190.0
- True
- https://ror.org/02wdwnk04
- Institutional archiving locations: Institution...
- NaN
- NaN
- NaN
- 9219045216097074691
- -8427874628140339220
-
-
- 6352
- 6353.0
- 3
- 222.0
- 0
- True
- 1.0
- 190.0
- True
- https://ror.org/029chgv08
- Institutional archiving locations: Institution...
- NaN
- NaN
- NaN
- 9219045216097074691
- -8427874628140339220
-
-
- 6362
- 6363.0
- 3
- 222.0
- 0
- True
- 1.0
- 190.0
- True
- https://ror.org/0472cxd90
- Institutional archiving locations: Institution...
- NaN
- NaN
- NaN
- 9219045216097074691
- -8427874628140339220
-
-
- 6357
- 6358.0
- 3
- 222.0
- 0
- True
- 1.0
- 190.0
- True
- https://ror.org/0456r8d26
- Institutional archiving locations: Institution...
- NaN
- NaN
- NaN
- 9219045216097074691
- -8427874628140339220
-
-
- 6363
- 6364.0
- 3
- 222.0
- 0
- True
- 1.0
- 190.0
- True
- https://ror.org/03x94j517
- Institutional archiving locations: Institution...
- NaN
- NaN
- NaN
- 9219045216097074691
- -8427874628140339220
-
-
-
-
48678 rows × 15 columns
-
-
-
-
-
-```python
-# duplicates
-term_orig.loc[term_orig.duplicated(subset='id_content_hash')].sort_values(by='id_content_hash')
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | version | cost_factor | embargo_months | archiving | licence | journal | prerequisite_funders | ror | comment | rp_id | valid_from | valid_until | id_content_hash | id_content_hash_licence |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 6607 | 6608.0 | 2 | NaN | 12 | True | NaN | 175.0 | True | https://ror.org/02wxr8x18 | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -9213354388875732238 | -5975042390572407328 |
-| 6508 | 6509.0 | 2 | NaN | 12 | True | NaN | 64.0 | True | https://ror.org/05w9mt194 | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -9213354388875732238 | -5975042390572407328 |
-| 1294 | 1295.0 | 2 | NaN | 12 | True | NaN | 342.0 | True | https://ror.org/056bwcz71 | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -9213354388875732238 | -5975042390572407328 |
-| 5561 | 5562.0 | 2 | NaN | 12 | True | NaN | 27.0 | True | https://ror.org/05w9mt194 | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -9213354388875732238 | -5975042390572407328 |
-| 5559 | 5560.0 | 2 | NaN | 12 | True | NaN | 27.0 | True | https://ror.org/056y81r79 | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -9213354388875732238 | -5975042390572407328 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 6355 | 6356.0 | 3 | 222.0 | 0 | True | 1.0 | 190.0 | True | https://ror.org/00cwqg982 | Institutional archiving locations: Institution... | NaN | NaN | NaN | 9219045216097074691 | -8427874628140339220 |
-| 6354 | 6355.0 | 3 | 222.0 | 0 | True | 1.0 | 190.0 | True | https://ror.org/02jkpm469 | Institutional archiving locations: Institution... | NaN | NaN | NaN | 9219045216097074691 | -8427874628140339220 |
-| 6353 | 6354.0 | 3 | 222.0 | 0 | True | 1.0 | 190.0 | True | https://ror.org/02wdwnk04 | Institutional archiving locations: Institution... | NaN | NaN | NaN | 9219045216097074691 | -8427874628140339220 |
-| 6364 | 6365.0 | 3 | 222.0 | 0 | True | 1.0 | 190.0 | True | https://ror.org/02gq0fg61 | Institutional archiving locations: Institution... | NaN | NaN | NaN | 9219045216097074691 | -8427874628140339220 |
-| 6359 | 6360.0 | 3 | 222.0 | 0 | True | 1.0 | 190.0 | True | https://ror.org/01613vh25 | Institutional archiving locations: Institution... | NaN | NaN | NaN | 9219045216097074691 | -8427874628140339220 |
-
-
-
-
47358 rows × 15 columns
-
-
-
-
-
-```python
-# use 999999 as a sentinel for missing licence / cost_factor so both columns can be cast to int
-term_orig['licence'] = term_orig['licence'].fillna(999999)
-term_orig['licence'] = term_orig['licence'].astype(int)
-term_orig['cost_factor'] = term_orig['cost_factor'].fillna(999999)
-term_orig['cost_factor'] = term_orig['cost_factor'].astype(int)
-# term_orig['embargo_months'] = term_orig['embargo_months'].fillna(0)
-# term_orig['embargo_months'] = term_orig['embargo_months'].astype(int)
-# derive a 0/1 flag from the boolean archiving column; missing values count as 0
-term_orig.loc[term_orig['archiving'] == True, 'ir_archiving'] = 1
-term_orig.loc[term_orig['archiving'] == False, 'ir_archiving'] = 0
-term_orig['ir_archiving'] = term_orig['ir_archiving'].fillna(0)
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | version | cost_factor | embargo_months | archiving | licence | journal | prerequisite_funders | ror | comment | rp_id | valid_from | valid_until | id_content_hash | id_content_hash_licence | ir_archiving |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | 1 | 999999 | 0 | True | 999999 | 532.0 | NaN | NaN | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -5068777248818105392 | -8194612545168817012 | 1.0 |
-| 1 | 2.0 | 2 | 999999 | 12 | True | 999999 | 532.0 | NaN | NaN | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -1187146317861229577 | 1080785657261440835 | 1.0 |
-| 2 | 3.0 | 3 | 355 | 0 | True | 1 | 532.0 | NaN | NaN | Institutional archiving locations: Any Website... | NaN | NaN | NaN | -6827815856646016670 | -4410614044147247907 | 1.0 |
-| 3 | 4.0 | 3 | 355 | 0 | True | 2 | 532.0 | NaN | NaN | Institutional archiving locations: Any Website... | NaN | NaN | NaN | 5388365857945903435 | -492868609330074007 | 1.0 |
-| 4 | 5.0 | 1 | 999999 | 0 | False | 999999 | 498.0 | NaN | NaN | Non institutional archiving locations: ChemRxi... | NaN | NaN | NaN | -2781821769548802966 | 935766765288137110 | 0.0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 48673 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/01swzsf04 | Cambridge University Press (CUP) Read & Publis... | 40079.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1.0 |
-| 48674 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/019whta54 | Cambridge University Press (CUP) Read & Publis... | 40080.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1.0 |
-| 48675 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/00vasag41 | Cambridge University Press (CUP) Read & Publis... | 40081.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1.0 |
-| 48676 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/05r0ap620 | Cambridge University Press (CUP) Read & Publis... | 40082.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1.0 |
-| 48677 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/05pmsvm27 | Cambridge University Press (CUP) Read & Publis... | 40083.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1.0 |
-
-
-
-
48678 rows × 16 columns
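
The cell above fills missing `licence` and `cost_factor` values with the sentinel `999999` so that the columns can be cast to `int`. As a possible alternative (not what the notebook does), pandas' nullable `Int64` dtype keeps the missing values explicit. The sketch below runs on a small made-up frame; the name `terms` and all values are illustrative assumptions.

```python
import pandas as pd

# Hypothetical miniature of term_orig; the values are illustrative only.
terms = pd.DataFrame({
    "licence": [1.0, None, 4.0],
    "archiving": [True, False, None],
})

# The nullable integer dtype stores missing values as <NA>, so no sentinel is needed.
terms["licence"] = terms["licence"].astype("Int64")

# Map the boolean flag to 0/1; a missing flag compares unequal to True and becomes 0,
# which mirrors the fillna(0) step in the original cell.
terms["ir_archiving"] = terms["archiving"].eq(True).astype(int)

print(terms)
```

Whether `<NA>` or a sentinel is preferable depends on what the consumer of the exported JSON expects, which the notebook does not state.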
-
-
-
-
-
-```python
-term_orig.loc[term_orig['ir_archiving'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | version | cost_factor | embargo_months | archiving | licence | journal | prerequisite_funders | ror | comment | rp_id | valid_from | valid_until | id_content_hash | id_content_hash_licence | ir_archiving |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-
-
-
-
-
-
-
-
-
-
-```python
-term_orig['ir_archiving'].value_counts()
-```
-
-
-
-
- 1.0 47467
- 0.0 1211
- Name: ir_archiving, dtype: int64
-
-
-
-
-```python
-term_orig['licence'] = term_orig['licence'].astype(int)
-term_orig['ir_archiving'] = term_orig['ir_archiving'].astype(int)
-term_orig['cost_factor'] = term_orig['cost_factor'].astype(int)
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | version | cost_factor | embargo_months | archiving | licence | journal | prerequisite_funders | ror | comment | rp_id | valid_from | valid_until | id_content_hash | id_content_hash_licence | ir_archiving |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | 1 | 999999 | 0 | True | 999999 | 532.0 | NaN | NaN | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -5068777248818105392 | -8194612545168817012 | 1 |
-| 1 | 2.0 | 2 | 999999 | 12 | True | 999999 | 532.0 | NaN | NaN | Institutional archiving locations: Non-Commerc... | NaN | NaN | NaN | -1187146317861229577 | 1080785657261440835 | 1 |
-| 2 | 3.0 | 3 | 355 | 0 | True | 1 | 532.0 | NaN | NaN | Institutional archiving locations: Any Website... | NaN | NaN | NaN | -6827815856646016670 | -4410614044147247907 | 1 |
-| 3 | 4.0 | 3 | 355 | 0 | True | 2 | 532.0 | NaN | NaN | Institutional archiving locations: Any Website... | NaN | NaN | NaN | 5388365857945903435 | -492868609330074007 | 1 |
-| 4 | 5.0 | 1 | 999999 | 0 | False | 999999 | 498.0 | NaN | NaN | Non institutional archiving locations: ChemRxi... | NaN | NaN | NaN | -2781821769548802966 | 935766765288137110 | 0 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 48673 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/01swzsf04 | Cambridge University Press (CUP) Read & Publis... | 40079.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1 |
-| 48674 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/019whta54 | Cambridge University Press (CUP) Read & Publis... | 40080.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1 |
-| 48675 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/00vasag41 | Cambridge University Press (CUP) Read & Publis... | 40081.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1 |
-| 48676 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/05r0ap620 | Cambridge University Press (CUP) Read & Publis... | 40082.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1 |
-| 48677 | NaN | 3 | 581 | 60 | True | 5 | 592.0 | NaN | https://ror.org/05pmsvm27 | Cambridge University Press (CUP) Read & Publis... | 40083.0 | 2021-01-01 | 2023-12-31 | 7687377827846095855 | 2298495942200956358 | 1 |
-
-
-
-
48678 rows × 16 columns
-
-
-
-
-
-```python
-terms_export_dates = term_orig.loc[(term_orig['valid_from'].notna()) | (term_orig['valid_until'].notna())][['id_content_hash', 'ror', 'valid_from', 'valid_until']]
-terms_export_dates
-```
-
-
-
-
-
-
-
-
-
-
-| | id_content_hash | ror | valid_from | valid_until |
-|---|---|---|---|---|
-| 8595 | -6020029623494903364 | https://ror.org/04d8ztx87 | 2020-01-01 | 2023-12-31 |
-| 8596 | -6020029623494903364 | https://ror.org/02bnkt322 | 2020-01-01 | 2023-12-31 |
-| 8597 | -6020029623494903364 | https://ror.org/00zg4za48 | 2020-01-01 | 2023-12-31 |
-| 8598 | -6020029623494903364 | https://ror.org/02s376052 | 2020-01-01 | 2023-12-31 |
-| 8599 | -6020029623494903364 | https://ror.org/05a28rw58 | 2020-01-01 | 2023-12-31 |
-| ... | ... | ... | ... | ... |
-| 48673 | 7687377827846095855 | https://ror.org/01swzsf04 | 2021-01-01 | 2023-12-31 |
-| 48674 | 7687377827846095855 | https://ror.org/019whta54 | 2021-01-01 | 2023-12-31 |
-| 48675 | 7687377827846095855 | https://ror.org/00vasag41 | 2021-01-01 | 2023-12-31 |
-| 48676 | 7687377827846095855 | https://ror.org/05r0ap620 | 2021-01-01 | 2023-12-31 |
-| 48677 | 7687377827846095855 | https://ror.org/05pmsvm27 | 2021-01-01 | 2023-12-31 |
-
-
-
-
40083 rows × 4 columns
-
-
-
-
-
-```python
-terms_export = term_orig[['id_sherpa', 'rp_id', 'id_content_hash', 'id_content_hash_licence', 'version', 'cost_factor', 'embargo_months', 'ir_archiving', 'licence', 'comment']]
-terms_export
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 48673 | NaN | 40079.0 | 7687377827846095855 | 2298495942200956358 | 3 | 581 | 60 | 1 | 5 | Cambridge University Press (CUP) Read & Publis... |
-| 48674 | NaN | 40080.0 | 7687377827846095855 | 2298495942200956358 | 3 | 581 | 60 | 1 | 5 | Cambridge University Press (CUP) Read & Publis... |
-| 48675 | NaN | 40081.0 | 7687377827846095855 | 2298495942200956358 | 3 | 581 | 60 | 1 | 5 | Cambridge University Press (CUP) Read & Publis... |
-| 48676 | NaN | 40082.0 | 7687377827846095855 | 2298495942200956358 | 3 | 581 | 60 | 1 | 5 | Cambridge University Press (CUP) Read & Publis... |
-| 48677 | NaN | 40083.0 | 7687377827846095855 | 2298495942200956358 | 3 | 581 | 60 | 1 | 5 | Cambridge University Press (CUP) Read & Publis... |
-
-
-
-
48678 rows × 10 columns
-
-
-
-
-
-```python
-# duplicate check
-terms_export.loc[terms_export.duplicated(subset='id_content_hash')].sort_values(by='id_content_hash')
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 6607 | 6608.0 | NaN | -9213354388875732238 | -5975042390572407328 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 6508 | 6509.0 | NaN | -9213354388875732238 | -5975042390572407328 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 1294 | 1295.0 | NaN | -9213354388875732238 | -5975042390572407328 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 5561 | 5562.0 | NaN | -9213354388875732238 | -5975042390572407328 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 5559 | 5560.0 | NaN | -9213354388875732238 | -5975042390572407328 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 6355 | 6356.0 | NaN | 9219045216097074691 | -8427874628140339220 | 3 | 222 | 0 | 1 | 1 | Institutional archiving locations: Institution... |
-| 6354 | 6355.0 | NaN | 9219045216097074691 | -8427874628140339220 | 3 | 222 | 0 | 1 | 1 | Institutional archiving locations: Institution... |
-| 6353 | 6354.0 | NaN | 9219045216097074691 | -8427874628140339220 | 3 | 222 | 0 | 1 | 1 | Institutional archiving locations: Institution... |
-| 6364 | 6365.0 | NaN | 9219045216097074691 | -8427874628140339220 | 3 | 222 | 0 | 1 | 1 | Institutional archiving locations: Institution... |
-| 6359 | 6360.0 | NaN | 9219045216097074691 | -8427874628140339220 | 3 | 222 | 0 | 1 | 1 | Institutional archiving locations: Institution... |
-
-
-
-
47358 rows × 10 columns
-
-
-
-
-
-```python
-terms_export_dedup = terms_export.drop_duplicates(subset=['id_content_hash'])
-terms_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8595 | NaN | 1.0 | -6020029623494903364 | -5435886237991661497 | 3 | 581 | 0 | 1 | 1 | Elsevier Read & Publish agreement |
-| 26723 | NaN | 18129.0 | -1955262099488276438 | 6359482801433181261 | 3 | 581 | 0 | 1 | 1 | NaN |
-| 33439 | NaN | 24845.0 | -681455397323083870 | 5265079689140421989 | 3 | 581 | 0 | 1 | 1 | Wiley Read & Publish agreement |
-| 47344 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement |
-| 47758 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... |
-
-
-
-
1320 rows × 10 columns
-
-
-
-
-
-```python
-terms_export_dedup_licence = terms_export.drop_duplicates(subset=['id_content_hash_licence'])
-terms_export_dedup_licence
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 47344 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement |
-| 47758 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... |
-| 47988 | NaN | 39394.0 | 7687377827846095855 | 2298497766188448059 | 3 | 581 | 60 | 1 | 4 | Cambridge University Press (CUP) Read & Publis... |
-| 48218 | NaN | 39624.0 | 7687377827846095855 | 2298486079450211665 | 3 | 581 | 60 | 1 | 2 | Cambridge University Press (CUP) Read & Publis... |
-| 48448 | NaN | 39854.0 | 7687377827846095855 | 2298495942200956358 | 3 | 581 | 60 | 1 | 5 | Cambridge University Press (CUP) Read & Publis... |
-
-
-
-
1590 rows × 10 columns
-
-
-
-
-
-```python
-# duplicate check
-terms_export_dedup_licence.loc[terms_export_dedup_licence.duplicated(subset='id_content_hash')].sort_values(by='id_content_hash')
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 1569 | 1570.0 | NaN | -9114006443623277513 | -7273388776362060491 | 3 | 413 | 0 | 0 | 2 | Non institutional archiving locations: PubMed ... |
-| 582 | 583.0 | NaN | -9011072484834895623 | -5911605112402338889 | 3 | 379 | 0 | 1 | 2 | Institutional archiving locations: Any Reposit... |
-| 8553 | 8554.0 | NaN | -8861630054613228454 | 7176773088076624015 | 3 | 573 | 0 | 0 | 3 | Non institutional archiving locations: Funder ... |
-| 8552 | 8553.0 | NaN | -8861630054613228454 | 7176773474396433690 | 3 | 573 | 0 | 0 | 2 | Non institutional archiving locations: Funder ... |
-| 8264 | 8265.0 | NaN | -8856152899298491735 | -1219996111910161561 | 3 | 560 | 0 | 1 | 4 | Institutional archiving locations: Non-Commerc... |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 8560 | 8561.0 | NaN | 8735446932641542951 | 4332046250364995695 | 3 | 574 | 0 | 0 | 2 | Non institutional archiving locations: Funder ... |
-| 8561 | 8562.0 | NaN | 8735446932641542951 | 4332048117937865978 | 3 | 574 | 0 | 0 | 3 | Non institutional archiving locations: Funder ... |
-| 2222 | 2223.0 | NaN | 8745253383893524719 | 521134702989893722 | 3 | 431 | 0 | 1 | 1 | Institutional archiving locations: Any Website... |
-| 4152 | 4153.0 | NaN | 8845243756736955098 | 6100113456095422831 | 3 | 464 | 0 | 1 | 1 | Institutional archiving locations: Any Website... |
-| 4351 | 4352.0 | NaN | 9036026380223066491 | -1539490241665655036 | 3 | 470 | 0 | 1 | 1 | Institutional archiving locations: Institution... |
-
-
-
-
270 rows × 10 columns
-
-
-
-
-
-```python
-# totals for the two sources
-terms_export_dedup.loc[terms_export_dedup['id_sherpa'].notna()].shape[0]
-```
-
-
-
-
- 1315
-
-
-
-
-```python
-terms_export_dedup.loc[terms_export_dedup['rp_id'].notna()].shape[0]
-```
-
-
-
-
- 5
-
-
-
-
-```python
-terms_export_dedup.loc[terms_export_dedup['rp_id'].notna()]
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 8595 | NaN | 1.0 | -6020029623494903364 | -5435886237991661497 | 3 | 581 | 0 | 1 | 1 | Elsevier Read & Publish agreement |
-| 26723 | NaN | 18129.0 | -1955262099488276438 | 6359482801433181261 | 3 | 581 | 0 | 1 | 1 | NaN |
-| 33439 | NaN | 24845.0 | -681455397323083870 | 5265079689140421989 | 3 | 581 | 0 | 1 | 1 | Wiley Read & Publish agreement |
-| 47344 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement |
-| 47758 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... |
-
-
-
-
-
-
-
-
-```python
-# convert the index into an id
-terms_export_dedup.reset_index(inplace=True)
-del terms_export_dedup['index']
-terms_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment |
-|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1315 | NaN | 1.0 | -6020029623494903364 | -5435886237991661497 | 3 | 581 | 0 | 1 | 1 | Elsevier Read & Publish agreement |
-| 1316 | NaN | 18129.0 | -1955262099488276438 | 6359482801433181261 | 3 | 581 | 0 | 1 | 1 | NaN |
-| 1317 | NaN | 24845.0 | -681455397323083870 | 5265079689140421989 | 3 | 581 | 0 | 1 | 1 | Wiley Read & Publish agreement |
-| 1318 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement |
-| 1319 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... |
-
-
-
-
1320 rows × 10 columns
-
-
-
-
-
-```python
-# add the id as index + 1
-terms_export_dedup['id'] = terms_export_dedup.index + 1
-# del terms_export_dedup['index']
-terms_export_dedup
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
-
-
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment | id |
-|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 1 |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 2 |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... | 3 |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... | 4 |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1315 | NaN | 1.0 | -6020029623494903364 | -5435886237991661497 | 3 | 581 | 0 | 1 | 1 | Elsevier Read & Publish agreement | 1316 |
-| 1316 | NaN | 18129.0 | -1955262099488276438 | 6359482801433181261 | 3 | 581 | 0 | 1 | 1 | NaN | 1317 |
-| 1317 | NaN | 24845.0 | -681455397323083870 | 5265079689140421989 | 3 | 581 | 0 | 1 | 1 | Wiley Read & Publish agreement | 1318 |
-| 1318 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement | 1319 |
-| 1319 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... | 1320 |
-
-
-
-
1320 rows × 11 columns
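
The `SettingWithCopyWarning` above appears because `terms_export_dedup` was derived from `terms_export` and pandas cannot tell whether the new column should also reach the parent frame. One common way to avoid it is to take an explicit copy before adding columns. The sketch below shows the idea on a tiny made-up frame; the data and the name `terms_dedup` are illustrative assumptions, not the notebook's actual values.

```python
import pandas as pd

# Hypothetical stand-in for terms_export; hashes are shortened for readability.
terms_export = pd.DataFrame({
    "id_content_hash": [11, 11, 22, 33, 33],
    "licence": [1, 2, 1, 4, 4],
})

# Keep the first row per hash, renumber the index, and take an explicit copy
# so that later column assignments act on an independent frame.
terms_dedup = (
    terms_export.drop_duplicates(subset=["id_content_hash"])
    .reset_index(drop=True)
    .copy()
)

# The 1-based id can now be assigned without the warning.
terms_dedup["id"] = terms_dedup.index + 1

print(terms_dedup)
```

Using `reset_index(drop=True)` also removes the need to delete the leftover `index` column afterwards.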
-
-
-
-
-
-```python
-terms_export_dedup['source'] = ''
-terms_export_dedup
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
- """Entry point for launching an IPython kernel.
-
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence | comment | id | source |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 1 | |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 2 | |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... | 3 | |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... | 4 | |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... | 5 | |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1315 | NaN | 1.0 | -6020029623494903364 | -5435886237991661497 | 3 | 581 | 0 | 1 | 1 | Elsevier Read & Publish agreement | 1316 | |
-| 1316 | NaN | 18129.0 | -1955262099488276438 | 6359482801433181261 | 3 | 581 | 0 | 1 | 1 | NaN | 1317 | |
-| 1317 | NaN | 24845.0 | -681455397323083870 | 5265079689140421989 | 3 | 581 | 0 | 1 | 1 | Wiley Read & Publish agreement | 1318 | |
-| 1318 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement | 1319 | |
-| 1319 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... | 1320 | |
-
-
-
-
1320 rows × 12 columns
-
-
-
-
-
-```python
-# group by licence
-terms_export_dedup_licences = terms_export_dedup_licence[['licence', 'id_content_hash']]
-terms_export_dedup_licences
-```
-
-
-
-
-
-
-
-
-
-
-| | licence | id_content_hash |
-|---|---|---|
-| 0 | 999999 | -5068777248818105392 |
-| 1 | 999999 | -1187146317861229577 |
-| 2 | 1 | -6827815856646016670 |
-| 3 | 2 | 5388365857945903435 |
-| 4 | 999999 | -2781821769548802966 |
-| ... | ... | ... |
-| 47344 | 1 | 6747956201225830719 |
-| 47758 | 1 | 7687377827846095855 |
-| 47988 | 4 | 7687377827846095855 |
-| 48218 | 2 | 7687377827846095855 |
-| 48448 | 5 | 7687377827846095855 |
-
-
-
-
1590 rows × 2 columns
-
-
-
-
-
-```python
-# concatenate values sharing the same id
-terms_export_dedup_licences['licence'] = terms_export_dedup_licences['licence'].astype(str)
-terms_export_dedup_licences = terms_export_dedup_licences.groupby('id_content_hash').agg({'licence': lambda x: ', '.join(x)})
-terms_export_dedup_licences
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
-
-
-
-
-
-
-
-
-
-
-
-
-| id_content_hash | licence |
-|---|---|
-| -9213354388875732238 | 999999 |
-| -9200070744422558377 | 999999 |
-| -9171783117023104395 | 1 |
-| -9134952646468948163 | 1 |
-| -9133013648751406289 | 1 |
-| ... | ... |
-| 9195001330432352893 | 1 |
-| 9200466168345981543 | 1 |
-| 9213878808178729253 | 2 |
-| 9218389208912777882 | 2 |
-| 9219045216097074691 | 1 |
-
-
-
-
1320 rows × 1 columns
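
The grouping step above joins licence ids in the order they are encountered, which is why later outputs mix forms such as `1, 2` and `2, 1`. If a stable order is wanted, the values can be sorted and de-duplicated before joining. The following is a minimal sketch on invented data; the variable names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical (licence, id_content_hash) pairs; hashes are shortened.
licences = pd.DataFrame({
    "id_content_hash": [11, 11, 22, 33, 33, 33],
    "licence": [2, 1, 1, 4, 1, 4],
})

# One row per hash, with licence ids sorted numerically, de-duplicated,
# and joined into a comma-separated string.
grouped = (
    licences.groupby("id_content_hash")["licence"]
    .agg(lambda s: ", ".join(str(v) for v in sorted(set(s))))
    .reset_index()
)

print(grouped)
```

Sorting before the cast to string keeps the order numeric, so `10` would follow `9` rather than `1`.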
-
-
-
-
-
-```python
-# check for multiple values
-terms_export_dedup_licences.loc[terms_export_dedup_licences['licence'].str.contains(',')]
-```
-
-
-
-
-
-
-
-
-
-
-| id_content_hash | licence |
-|---|---|
-| -9114006443623277513 | 1, 2 |
-| -9011072484834895623 | 1, 2 |
-| -8861630054613228454 | 1, 2, 3 |
-| -8856152899298491735 | 1, 4 |
-| -8607167568720519189 | 1, 4 |
-| ... | ... |
-| 8712161777436385390 | 1, 4 |
-| 8735446932641542951 | 1, 2, 3 |
-| 8745253383893524719 | 2, 1 |
-| 8845243756736955098 | 2, 1 |
-| 9036026380223066491 | 2, 1 |
-
-
-
-
185 rows × 1 columns
-
-
-
-
-
-```python
-# add the grouped licences
-terms_export_dedup_fin = pd.merge(terms_export_dedup, terms_export_dedup_licences, on='id_content_hash', how='left')
-terms_export_dedup_fin
-```
-
-
-
-
-
-
-
-
-
-
-| | id_sherpa | rp_id | id_content_hash | id_content_hash_licence | version | cost_factor | embargo_months | ir_archiving | licence_x | comment | id | source | licence_y |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1.0 | NaN | -5068777248818105392 | -8194612545168817012 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 1 | | 999999 |
-| 1 | 2.0 | NaN | -1187146317861229577 | 1080785657261440835 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 2 | | 999999 |
-| 2 | 3.0 | NaN | -6827815856646016670 | -4410614044147247907 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... | 3 | | 1 |
-| 3 | 4.0 | NaN | 5388365857945903435 | -492868609330074007 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... | 4 | | 2 |
-| 4 | 5.0 | NaN | -2781821769548802966 | 935766765288137110 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... | 5 | | 999999 |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1315 | NaN | 1.0 | -6020029623494903364 | -5435886237991661497 | 3 | 581 | 0 | 1 | 1 | Elsevier Read & Publish agreement | 1316 | | 1, 2 |
-| 1316 | NaN | 18129.0 | -1955262099488276438 | 6359482801433181261 | 3 | 581 | 0 | 1 | 1 | NaN | 1317 | | 1, 4 |
-| 1317 | NaN | 24845.0 | -681455397323083870 | 5265079689140421989 | 3 | 581 | 0 | 1 | 1 | Wiley Read & Publish agreement | 1318 | | 1, 4, 2 |
-| 1318 | NaN | 38750.0 | 6747956201225830719 | -4648758608429098534 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement | 1319 | | 1 |
-| 1319 | NaN | 39164.0 | 7687377827846095855 | 2298488065455407402 | 3 | 581 | 60 | 1 | 1 | Cambridge University Press (CUP) Read & Publis... | 1320 | | 1, 4, 2, 5 |
-
-
-
-
1320 rows × 13 columns
-
-
-
-
-
-```python
-# merge with the dates to get the term ids
-terms_export_dates = pd.merge(terms_export_dates, terms_export_dedup_fin[['id_content_hash', 'id']], on='id_content_hash')
-terms_export_dates = terms_export_dates.rename(columns = {'id' : 'term'})
-terms_export_dates
-```
-
-
-
-
-
-
-
-
-
-
-| | id_content_hash | ror | valid_from | valid_until | term |
-|---|---|---|---|---|---|
-| 0 | -6020029623494903364 | https://ror.org/04d8ztx87 | 2020-01-01 | 2023-12-31 | 1316 |
-| 1 | -6020029623494903364 | https://ror.org/02bnkt322 | 2020-01-01 | 2023-12-31 | 1316 |
-| 2 | -6020029623494903364 | https://ror.org/00zg4za48 | 2020-01-01 | 2023-12-31 | 1316 |
-| 3 | -6020029623494903364 | https://ror.org/02s376052 | 2020-01-01 | 2023-12-31 | 1316 |
-| 4 | -6020029623494903364 | https://ror.org/05a28rw58 | 2020-01-01 | 2023-12-31 | 1316 |
-| ... | ... | ... | ... | ... | ... |
-| 40078 | 7687377827846095855 | https://ror.org/01swzsf04 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40079 | 7687377827846095855 | https://ror.org/019whta54 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40080 | 7687377827846095855 | https://ror.org/00vasag41 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40081 | 7687377827846095855 | https://ror.org/05r0ap620 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40082 | 7687377827846095855 | https://ror.org/05pmsvm27 | 2021-01-01 | 2023-12-31 | 1320 |
-
-
-
-
40083 rows × 5 columns
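
The merge above relies on every `id_content_hash` in the dates table matching exactly one de-duplicated term. A hedged way to check that assumption is to use the `validate` and `indicator` options of `DataFrame.merge`. The frames below are invented for illustration, and the `ror` values are placeholders rather than real identifiers.

```python
import pandas as pd

# Hypothetical dates table: several institutions can share one term hash.
dates = pd.DataFrame({
    "id_content_hash": [11, 11, 33],
    "ror": ["ror-a", "ror-b", "ror-c"],  # placeholder identifiers
})
# Hypothetical de-duplicated terms: one id per hash.
terms = pd.DataFrame({"id_content_hash": [11, 22, 33], "id": [1, 2, 3]})

# validate raises MergeError if a hash occurs more than once in `terms`;
# indicator adds a _merge column showing which rows found a match.
merged = dates.merge(
    terms, on="id_content_hash", how="left",
    validate="many_to_one", indicator=True,
)
unmatched = merged.loc[merged["_merge"] != "both"]

merged = merged.drop(columns="_merge").rename(columns={"id": "term"})
print(merged)
print(len(unmatched), "rows without a term id")
```

With `how="left"`, rows lacking a term are kept and can be inspected, whereas the default inner join used in the notebook would drop them silently.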
-
-
-
-
-
-```python
-# rename the licence fields
-del terms_export_dedup_fin['licence_x']
-terms_export_dedup_fin = terms_export_dedup_fin.rename(columns = {'licence_y' : 'licence'})
-```
-
-
-```python
-terms_export_fin = terms_export_dedup_fin[['version', 'cost_factor', 'embargo_months', 'ir_archiving', 'licence', 'comment', 'id', 'source']]
-terms_export_fin
-```
-
-| | version | cost_factor | embargo_months | ir_archiving | licence | comment | id | source |
-|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 999999 | 0 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 1 | |
-| 1 | 2 | 999999 | 12 | 1 | 999999 | Institutional archiving locations: Non-Commerc... | 2 | |
-| 2 | 3 | 355 | 0 | 1 | 1 | Institutional archiving locations: Any Website... | 3 | |
-| 3 | 3 | 355 | 0 | 1 | 2 | Institutional archiving locations: Any Website... | 4 | |
-| 4 | 1 | 999999 | 0 | 0 | 999999 | Non institutional archiving locations: ChemRxi... | 5 | |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1315 | 3 | 581 | 0 | 1 | 1, 2 | Elsevier Read & Publish agreement | 1316 | |
-| 1316 | 3 | 581 | 0 | 1 | 1, 4 | NaN | 1317 | |
-| 1317 | 3 | 581 | 0 | 1 | 1, 4, 2 | Wiley Read & Publish agreement | 1318 | |
-| 1318 | 3 | 581 | 0 | 1 | 1 | Taylor and Francis Read & Publish agreement | 1319 | |
-| 1319 | 3 | 581 | 60 | 1 | 1, 4, 2, 5 | Cambridge University Press (CUP) Read & Publis... | 1320 | |
-
1320 rows × 8 columns
-
-
-
-
-
-```python
-# export the table as JSON
-result = terms_export_fin.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/term.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export as TSV (tab-separated, to match the .tsv extension)
-terms_export_fin.to_csv('sample/term.tsv', sep='\t', index=False)
-```
-
-
-```python
-# export excel
-terms_export_fin.to_excel('sample/term.xlsx', index=False)
-```
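
The JSON and delimited-text exports are repeated for every table in this notebook. One way to factor them out is sketched below. The helper name `export_table` and its parameters are hypothetical and not part of the original notebook, and the Excel export is left out to avoid depending on an Excel writer engine.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def export_table(df, stem, out_dir='sample'):
    """Write df as <stem>.json (list of records) and <stem>.tsv (tab-separated).

    Hypothetical helper: it mirrors the export cells of this notebook.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Round-trip through to_json so pandas handles NaN and dtype conversion.
    records = json.loads(df.to_json(orient='records', force_ascii=False))
    json_path = out / f'{stem}.json'
    with open(json_path, 'w', encoding='utf-8') as fh:
        json.dump(records, fh, indent=2, ensure_ascii=False)

    # sep='\t' so the file content matches the .tsv extension.
    tsv_path = out / f'{stem}.tsv'
    df.to_csv(tsv_path, sep='\t', index=False)
    return json_path, tsv_path


# Small demonstration in a temporary directory.
demo = pd.DataFrame({'id': [1, 2], 'condition_issuer': ['Journal-only', 'Organization-only']})
with tempfile.TemporaryDirectory() as tmp:
    json_path, tsv_path = export_table(demo, 'condition_type', tmp)
    exported_records = json.loads(json_path.read_text(encoding='utf-8'))
    tsv_header = tsv_path.read_text(encoding='utf-8').splitlines()[0]
```

In the notebook, a call such as `export_table(terms_export_fin, 'term')` would then replace the separate JSON and TSV cells.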
-
-## Table condition_type
-
-
-```python
-# Condition issuers: Journal-only, Organization-only, Journal-organization agreement.
-# Built in a single constructor call because DataFrame.append was removed in pandas 2.0.
-condition_type = pd.DataFrame({
- 'id': [1, 2, 3],
- 'condition_issuer': ['Journal-only', 'Organization-only', 'Journal-organization agreement'],
-})
-condition_type
-```
-
-| | id | condition_issuer |
-|---|---|---|
-| 0 | 1 | Journal-only |
-| 1 | 2 | Organization-only |
-| 2 | 3 | Journal-organization agreement |
-
-```python
-# export the table as JSON
-result = condition_type.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/condition_type.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export as TSV (tab-separated, to match the .tsv extension)
-condition_type.to_csv('sample/condition_type.tsv', sep='\t', index=False)
-```
-
-
-```python
-# export excel
-condition_type.to_excel('sample/condition_type.xlsx', index=False)
-```
-
-## Table organization
-
-
-```python
-# extract the organizations (funders)
-sherpa
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
- cost_factor
-
-
-
-
- 0
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Must acknowledge acceptance for publication ; ...
- NaN
- 1
- 0001-2815
- 1
- NaN
- NaN
-
-
- 1
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/2050
- no
- no
- accepted
- NaN
- 12
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; named_repository ; non_comm...
- Non-Commercial Institutional Repository
- Author's Homepage ; arXiv ; AgEcon ; PhilPaper...
- arXiv ; AgEcon ; PhilPapers ; PubMed Central ;...
- NaN
- NaN
- NaN
- True
- Publisher source must be acknowledged with cit...
- NaN
- 2
- 0001-2815
- 2
- NaN
- NaN
-
-
- 2
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; institutional_repository ; named...
- Any Website ; Institutional Repository
- PubMed Central ; Subject Repository ; Journal ...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 3
- 0001-2815
- 3
- 1.0
- 355.0
-
-
- 3
- 532
- 0001-2815
- 11905
- https://v2.sherpa.ac.uk/id/publisher_policy/3315
- no
- yes
- published
- cc_by_nc_nd
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_website ; named_repository ; non_commercia...
- Any Website ; Non-Commercial Institutional Rep...
- PubMed Central ; Non-Commercial Subject Reposi...
- PubMed Central
- NaN
- authors
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged
- NaN
- 4
- 0001-2815
- 3
- 2.0
- 355.0
-
-
- 4
- 498
- 0001-4842
- 7760
- https://v2.sherpa.ac.uk/id/publisher_policy/4
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- named_repository ; preprint_repository ; subje...
- NaN
- ChemRxiv ; bioRxiv ; arXiv ; Preprint Reposito...
- ChemRxiv ; bioRxiv ; arXiv
- NaN
- NaN
- NaN
- False
- Must not violate ACS ethical Guidelines ; Must...
- NaN
- 5
- 0001-4842
- 1
- NaN
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8590
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- submitted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8591
- 2475-9953
- 1
- NaN
- NaN
-
-
- 8591
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- accepted
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8592
- 2475-9953
- 2
- NaN
- NaN
-
-
- 8592
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- no
- published
- NaN
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- authors_homepage ; institutional_repository ; ...
- Institutional Repository ; Institutional Website
- Author's Homepage
- NaN
- NaN
- NaN
- NaN
- True
- Must link to published article ; Publisher cop...
- NaN
- 8593
- 2475-9953
- 3
- NaN
- 580.0
-
-
- 8593
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8594
- 2475-9953
- 3
- 1.0
- 580.0
-
-
- 8594
- 608
- 2475-9953
- 33503
- https://v2.sherpa.ac.uk/id/publisher_policy/10
- no
- yes
- published
- cc_by
- 0
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- NaN
- any_repository ; this_journal
- Any Repository
- Journal Website
- NaN
- NaN
- NaN
- NaN
- True
- NaN
- NaN
- 8595
- 2475-9953
- 3
- 1.0
- 580.0
-
-
-
-
8595 rows × 33 columns
-
-
-
-
-
-```python
-sherpa.loc[sherpa['prerequisite_funders'].notna()]
-```
-
-
-
-
-
-
-
-
-
-
- journal
- issn
- sherpa_id
- sherpa_uri
- open_access_prohibited
- additional_oa_fee
- article_version
- sherpa_code
- embargo
- prerequisites
- prerequisite_funders
- prerequisite_funders_name
- prerequisite_funders_fundref
- prerequisite_funders_ror
- prerequisite_funders_country
- prerequisite_funders_url
- prerequisite_funders_sherpa_id
- prerequisite_subjects
- location
- locations_ir
- locations_not_ir
- named_repository
- named_academic_social_network
- copyright_owner
- publisher_deposit
- archiving
- conditions
- public_notes
- id
- issnl
- version
- licence
- cost_factor
-
-
-
-
- 16
- 789
- 0001-4966
- 4049
- https://v2.sherpa.ac.uk/id/publisher_policy/126
- no
- no
- published
- NaN
- 12
- NaN
- True
- National Institutes of Health (NIH)
- http://dx.doi.org/10.13039/100000002
- https://ror.org/01cwqze88
- us
- http://www.nih.gov/
- 9.0
- NaN
- named_repository
- NaN
- PubMed Central
- PubMed Central
- NaN
- NaN
- disciplinary (PubMed Central) ;
- False
- NaN
- NaN
- 17
- 0001-4966
- 3
- NaN
- 357.0
-
-
- 28
- 668
- 0002-0729
- 1334
- https://v2.sherpa.ac.uk/id/publisher_policy/1107
- no
- no
- accepted
- NaN
- 12
- NaN
- True
- National Institutes of Health (NIH)
- http://dx.doi.org/10.13039/100000002
- https://ror.org/01cwqze88
- us
- http://www.nih.gov/
- 9.0
- NaN
- named_repository
- NaN
- PubMed Central
- PubMed Central
- NaN
- NaN
- disciplinary (PubMed Central) ;
- False
- NaN
- NaN
- 29
- 0002-0729
- 2
- NaN
- NaN
-
-
- 58
- 985
- 0002-9343
- 12950
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- Wellcome Trust
- http://dx.doi.org/10.13039/100004440
- https://ror.org/029chgv08
- gb
- http://www.wellcome.ac.uk/
- 695.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 59
- 0002-9343
- 3
- 1.0
- 223.0
-
-
- 59
- 985
- 0002-9343
- 12950
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- British Heart Foundation (BHF)
- http://dx.doi.org/10.13039/501100000274
- https://ror.org/02wdwnk04
- gb
- http://www.bhf.org.uk/
- 18.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 60
- 0002-9343
- 3
- 1.0
- 223.0
-
-
- 60
- 985
- 0002-9343
- 12950
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- Versus Arthritis
- http://dx.doi.org/10.13039/501100000341
- https://ror.org/02jkpm469
- gb
- https://www.versusarthritis.org/
- 14.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 61
- 0002-9343
- 3
- 1.0
- 223.0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 8510
- 990
- 2211-2855
- 20490
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- European Research Council (ERC)
- http://dx.doi.org/10.13039/501100000781
- https://ror.org/0472cxd90
- be
- http://erc.europa.eu/
- 31.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 8511
- 2211-2855
- 3
- 1.0
- 352.0
-
-
- 8511
- 990
- 2211-2855
- 20490
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- Medical Research Council (MRC)
- http://dx.doi.org/10.13039/501100000265
- https://ror.org/03x94j517
- gb
- http://www.mrc.ac.uk/index.htm
- 705.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 8512
- 2211-2855
- 3
- 1.0
- 352.0
-
-
- 8512
- 990
- 2211-2855
- 20490
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- Motor Neuron Disease Association (MND Associat...
- http://dx.doi.org/10.13039/501100000406
- https://ror.org/02gq0fg61
- gb
- http://www.mndassociation.org/
- 562.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 8513
- 2211-2855
- 3
- 1.0
- 352.0
-
-
- 8513
- 990
- 2211-2855
- 20490
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- Parkinson's UK
- http://dx.doi.org/10.13039/501100000304
- https://ror.org/02417p338
- gb
- http://www.parkinsons.org.uk/
- 411.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 8514
- 2211-2855
- 3
- 1.0
- 352.0
-
-
- 8514
- 990
- 2211-2855
- 20490
- https://v2.sherpa.ac.uk/id/publisher_policy/3323
- no
- yes
- published
- cc_by
- 0
- NaN
- True
- Telethon Foundation
- http://dx.doi.org/10.13039/501100002426
- https://ror.org/04xraxn18
- it
- https://www.telethon.it/en/
- 325.0
- NaN
- institutional_repository ; named_repository ; ...
- Institutional Repository
- PubMed Central ; Research for Development Repo...
- PubMed Central ; Research for Development Repo...
- NaN
- NaN
- disciplinary (PubMed Central) ;
- True
- Published source must be acknowledged with cit...
- NaN
- 8515
- 2211-2855
- 3
- 1.0
- 352.0
-
-
-
-
5585 rows × 33 columns
-
-
-
-
-
-```python
-sherpa['prerequisite_funders'].value_counts()
-```
-
-
-
-
- True 5585
- Name: prerequisite_funders, dtype: int64
-
-
-
-
-```python
-funders = sherpa.loc[sherpa['prerequisite_funders'].notna()][['prerequisite_funders_name', 'prerequisite_funders_fundref', 'prerequisite_funders_ror', 'prerequisite_funders_country', 'prerequisite_funders_url', 'prerequisite_funders_sherpa_id']]
-funders
-```
-
-| | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id |
-|---|---|---|---|---|---|---|
-| 16 | National Institutes of Health (NIH) | http://dx.doi.org/10.13039/100000002 | https://ror.org/01cwqze88 | us | http://www.nih.gov/ | 9.0 |
-| 28 | National Institutes of Health (NIH) | http://dx.doi.org/10.13039/100000002 | https://ror.org/01cwqze88 | us | http://www.nih.gov/ | 9.0 |
-| 58 | Wellcome Trust | http://dx.doi.org/10.13039/100004440 | https://ror.org/029chgv08 | gb | http://www.wellcome.ac.uk/ | 695.0 |
-| 59 | British Heart Foundation (BHF) | http://dx.doi.org/10.13039/501100000274 | https://ror.org/02wdwnk04 | gb | http://www.bhf.org.uk/ | 18.0 |
-| 60 | Versus Arthritis | http://dx.doi.org/10.13039/501100000341 | https://ror.org/02jkpm469 | gb | https://www.versusarthritis.org/ | 14.0 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 8510 | European Research Council (ERC) | http://dx.doi.org/10.13039/501100000781 | https://ror.org/0472cxd90 | be | http://erc.europa.eu/ | 31.0 |
-| 8511 | Medical Research Council (MRC) | http://dx.doi.org/10.13039/501100000265 | https://ror.org/03x94j517 | gb | http://www.mrc.ac.uk/index.htm | 705.0 |
-| 8512 | Motor Neuron Disease Association (MND Associat... | http://dx.doi.org/10.13039/501100000406 | https://ror.org/02gq0fg61 | gb | http://www.mndassociation.org/ | 562.0 |
-| 8513 | Parkinson's UK | http://dx.doi.org/10.13039/501100000304 | https://ror.org/02417p338 | gb | http://www.parkinsons.org.uk/ | 411.0 |
-| 8514 | Telethon Foundation | http://dx.doi.org/10.13039/501100002426 | https://ror.org/04xraxn18 | it | https://www.telethon.it/en/ | 325.0 |
-
5585 rows × 6 columns
-
-
-
-
-
-```python
-funders_dedup = funders.drop_duplicates(subset='prerequisite_funders_ror')
-funders_dedup
-```
-
-| | prerequisite_funders_name | prerequisite_funders_fundref | prerequisite_funders_ror | prerequisite_funders_country | prerequisite_funders_url | prerequisite_funders_sherpa_id |
-|---|---|---|---|---|---|---|
-| 16 | National Institutes of Health (NIH) | http://dx.doi.org/10.13039/100000002 | https://ror.org/01cwqze88 | us | http://www.nih.gov/ | 9.0 |
-| 58 | Wellcome Trust | http://dx.doi.org/10.13039/100004440 | https://ror.org/029chgv08 | gb | http://www.wellcome.ac.uk/ | 695.0 |
-| 59 | British Heart Foundation (BHF) | http://dx.doi.org/10.13039/501100000274 | https://ror.org/02wdwnk04 | gb | http://www.bhf.org.uk/ | 18.0 |
-| 60 | Versus Arthritis | http://dx.doi.org/10.13039/501100000341 | https://ror.org/02jkpm469 | gb | https://www.versusarthritis.org/ | 14.0 |
-| 61 | Biotechnology and Biological Sciences Research... | http://dx.doi.org/10.13039/501100000268 | https://ror.org/00cwqg982 | gb | http://www.bbsrc.ac.uk/home/home.aspx | 709.0 |
-| 62 | Blood Cancer UK | http://dx.doi.org/10.13039/501100007903 | https://ror.org/0055acf80 | gb | https://bloodcancer.org.uk/ | 925.0 |
-| 63 | Bill & Melinda Gates Foundation | http://dx.doi.org/10.13039/100000865 | https://ror.org/0456r8d26 | us | http://www.gatesfoundation.org/ | 961.0 |
-| 64 | Cancer Research UK | http://dx.doi.org/10.13039/501100000289 | https://ror.org/054225q67 | gb | http://www.cancerresearchuk.org/ | 19.0 |
-| 65 | Chief Scientist Office, Scottish Executive (CSO) | http://dx.doi.org/10.13039/501100000589 | https://ror.org/01613vh25 | gb | http://www.cso.scot.nhs.uk/ | 16.0 |
-| 66 | Department of Health (DH) | http://dx.doi.org/10.13039/501100000272 | https://ror.org/0187kwz08 | gb | http://www.dh.gov.uk/en/index.htm | 943.0 |
-| 67 | Dunhill Medical Trust (DMT) | http://dx.doi.org/10.13039/501100000377 | https://ror.org/05ayqqv15 | gb | https://dunhillmedical.org.uk/ | 410.0 |
-| 68 | European Research Council (ERC) | http://dx.doi.org/10.13039/501100000781 | https://ror.org/0472cxd90 | be | http://erc.europa.eu/ | 31.0 |
-| 69 | Medical Research Council (MRC) | http://dx.doi.org/10.13039/501100000265 | https://ror.org/03x94j517 | gb | http://www.mrc.ac.uk/index.htm | 705.0 |
-| 70 | Motor Neuron Disease Association (MND Associat... | http://dx.doi.org/10.13039/501100000406 | https://ror.org/02gq0fg61 | gb | http://www.mndassociation.org/ | 562.0 |
-| 71 | Parkinson's UK | http://dx.doi.org/10.13039/501100000304 | https://ror.org/02417p338 | gb | http://www.parkinsons.org.uk/ | 411.0 |
-| 72 | Telethon Foundation | http://dx.doi.org/10.13039/501100002426 | https://ror.org/04xraxn18 | it | https://www.telethon.it/en/ | 325.0 |
-| 99 | Howard Hughes Medical Institute (HHMI) | http://dx.doi.org/10.13039/100000011 | https://ror.org/006w34k90 | us | http://www.hhmi.org/ | 24.0 |
-| 149 | Arts and Humanities Research Council (AHRC) | http://dx.doi.org/10.13039/501100000267 | https://ror.org/0505m1554 | gb | http://www.ahrc.ac.uk/Pages/Home.aspx | 698.0 |
-| 150 | Austrian Science Fund (FWF) | http://dx.doi.org/10.13039/501100002428 | https://ror.org/013tf3c58 | at | http://www.fwf.ac.at/en/ | 13.0 |
-| 153 | Breast Cancer Now | http://dx.doi.org/10.13039/501100007913 | https://ror.org/02qa92s63 | gb | http://breastcancernow.org/ | 1065.0 |
-| 156 | Engineering and Physical Sciences Research Cou... | http://dx.doi.org/10.13039/501100000266 | https://ror.org/0439y7842 | gb | http://www.epsrc.ac.uk/Pages/default.aspx | 722.0 |
-| 159 | Natural Environment Research Council (NERC) | http://dx.doi.org/10.13039/501100000270 | https://ror.org/02b5d8509 | gb | https://nerc.ukri.org/ | 726.0 |
-| 162 | Science and Technology Facilities Council (STFC) | http://dx.doi.org/10.13039/501100000271 | https://ror.org/057g20z61 | gb | http://www.stfc.ac.uk/ | 716.0 |
-| 164 | Vetenskapsrådet | http://dx.doi.org/10.13039/501100004359 | https://ror.org/03zttf063 | se | http://www.vr.se/ | 302.0 |
-| 165 | World Health Organization (WHO) | http://dx.doi.org/10.13039/100004423 | https://ror.org/01f80g185 | ch | http://www.who.int/ | 903.0 |
-| 166 | World Bank | http://dx.doi.org/10.13039/100004421 | https://ror.org/00ae7jd04 | us | http://www.worldbank.org/ | 525.0 |
-| 167 | Yorkshire Cancer Research | http://dx.doi.org/10.13039/501100002653 | https://ror.org/02cddnn97 | gb | http://www.yorkshirecancerresearch.org.uk/ | 428.0 |
-| 169 | Economic and Social Research Council (ESRC) | http://dx.doi.org/10.13039/501100000269 | https://ror.org/03n0ht308 | gb | http://www.esrc.ac.uk/ | 717.0 |
-| 418 | Higher Education Funding Council for England (... | http://dx.doi.org/10.13039/501100000384 | https://ror.org/02wxr8x18 | gb | http://www.hefce.ac.uk/ | 877.0 |
-| 419 | Higher Education Funding Council for Wales (HE... | http://dx.doi.org/10.13039/501100000383 | https://ror.org/056y81r79 | gb | http://www.hefcw.ac.uk/home/home.aspx | 881.0 |
-| 420 | Scottish Funding Council (SFC) | http://dx.doi.org/10.13039/501100000360 | https://ror.org/056bwcz71 | gb | http://www.sfc.ac.uk/ | 887.0 |
-| 421 | Department for the Economy, Northern Ireland | http://dx.doi.org/10.13039/100008303 | https://ror.org/05w9mt194 | gb | https://www.economy-ni.gov.uk/ | 884.0 |
-| 960 | Academy of Finland | http://dx.doi.org/10.13039/501100002341 | https://ror.org/05k73zm37 | fi | https://www.aka.fi/en/ | 1248.0 |
-| 961 | Agence Nationale de la Recherche (ANR) | http://dx.doi.org/10.13039/501100001665 | https://ror.org/00rbzpz17 | fr | http://www.agence-nationale-recherche.fr/ | 30.0 |
-| 963 | Fundação para a Ciência e a Tecnologia | http://dx.doi.org/10.13039/501100001871 | https://ror.org/00snfqn58 | pt | http://www.fct.pt/ | 1109.0 |
-| 964 | Formas | http://dx.doi.org/10.13039/501100001862 | https://ror.org/03pjs1y45 | se | http://www.formas.se/ | 452.0 |
-| 967 | Nederlandse Organisatie voor Wetenschappelijk ... | http://dx.doi.org/10.13039/501100003246 | https://ror.org/04jsz6e67 | nl | http://www.nwo.nl/ | 459.0 |
-| 968 | Science Foundation Ireland (SFI) | http://dx.doi.org/10.13039/501100001602 | https://ror.org/0271asj38 | ie | http://www.sfi.ie/ | 210.0 |
-| 970 | Research Council of Norway | http://dx.doi.org/10.13039/501100005416 | https://ror.org/00epmv149 | no | https://www.forskningsradet.no/en/ | 266.0 |
-| 971 | Forskningsrådet för hälsa, arbetsliv och välfä... | http://dx.doi.org/10.13039/501100006636 | https://ror.org/02d290r06 | se | http://www.forte.se/ | 455.0 |
-| 978 | Innovate UK | http://dx.doi.org/10.13039/501100000266 | https://ror.org/05ar5fy68 | gb | https://www.gov.uk/government/organisations/in... | 1267.0 |
-| 1048 | Diabetes UK | http://dx.doi.org/10.13039/501100000361 | https://ror.org/050rgn017 | gb | http://www.diabetes.org.uk/ | 492.0 |
-| 1052 | Marie Curie | http://dx.doi.org/10.13039/501100000654 | https://ror.org/02aqv1x10 | gb | http://www.mariecurie.org.uk/ | 595.0 |
-| 1055 | Action on Hearing Loss | http://dx.doi.org/10.13039/501100000703 | https://ror.org/05w6qh410 | gb | http://www.actiononhearingloss.org.uk/ | 412.0 |
-| 1056 | Alzheimer's Society | http://dx.doi.org/10.13039/501100000320 | https://ror.org/0472gwq90 | gb | http://alzheimers.org.uk/ | 443.0 |
-| 1063 | Multiple Sclerosis Society | http://dx.doi.org/10.13039/501100000381 | https://ror.org/043fwdk81 | gb | http://www.mssociety.org.uk/ | 745.0 |
-| 1064 | Myrovlytis Trust | http://dx.doi.org/10.13039/501100001291 | https://ror.org/05bj02613 | gb | http://www.myrovlytistrust.org/ | 858.0 |
-| 1065 | National Centre for the Replacement, Refinemen... | http://dx.doi.org/10.13039/501100000849 | https://ror.org/02w0kg036 | gb | http://www.nc3rs.org.uk/ | 859.0 |
-| 1072 | Worldwide Cancer Reseach | http://dx.doi.org/10.13039/100004423 | https://ror.org/031tfbz57 | gb | http://www.worldwidecancerresearch.org/ | 425.0 |
-| 2219 | Canadian Institutes of Health Research (CIHR) | http://dx.doi.org/10.13039/501100000024 | https://ror.org/01gavpb45 | ca | http://www.cihr-irsc.gc.ca/ | 28.0 |
-| 5490 | US Department of Energy (DOE) | http://dx.doi.org/10.13039/100000015 | https://ror.org/01bj3aw27 | us | http://energy.gov/ | 962.0 |
-| 5491 | Agency for Healthcare Research and Quality (AHRQ) | http://dx.doi.org/10.13039/100000133 | https://ror.org/03jmfdf59 | us | http://www.ahrq.gov/index.html | 981.0 |
-| 5492 | Institute of Education Sciences (IES) | http://dx.doi.org/10.13039/100005246 | https://ror.org/04et59085 | us | http://ies.ed.gov/ | 291.0 |
-| 5493 | National Aeronautics and Space Administration ... | http://dx.doi.org/10.13039/100000104 | https://ror.org/027ka1x80 | us | http://science.nasa.gov/ | 986.0 |
-| 5494 | National Science Foundation (NSF) | http://dx.doi.org/10.13039/100000001 | https://ror.org/021nxhr62 | us | http://www.nsf.gov/ | 354.0 |
-| 7232 | Academy of Medical Science | http://dx.doi.org/10.13039/501100000691 | https://ror.org/00c489v88 | gb | https://acmedsci.ac.uk/ | 1125.0 |
-| 7239 | Prostate Cancer UK | http://dx.doi.org/10.13039/501100000771 | https://ror.org/04dkv6329 | gb | http://prostatecanceruk.org/ | 742.0 |
-| 7240 | Schweizerischer Nationalfonds zur Förderung de... | http://dx.doi.org/10.13039/501100001711 | https://ror.org/00yjd3n13 | ch | http://www.snf.ch/de/Seiten/default.aspx | 25.0 |
-
-```python
-funders_dedup.shape[0]
-```
-
-
-
-
- 58
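
Deduplicating on `prerequisite_funders_ror` leaves 58 funders, but some of them still share a Funder Registry DOI in the source data: Innovate UK carries the same `fundref` as EPSRC, and Worldwide Cancer Reseach the same as WHO. A minimal sketch for flagging such rows before they reach the `organization` table follows. The frame `funders_sample` is a hand-copied excerpt of the listing above; in the notebook the same check would presumably run on `funders_dedup`.

```python
import pandas as pd

# Excerpt of the deduplicated funders, copied from the notebook output.
funders_sample = pd.DataFrame({
    'prerequisite_funders_name': [
        'Engineering and Physical Sciences Research Cou...',
        'Innovate UK',
        'World Health Organization (WHO)',
        'Worldwide Cancer Reseach',
        'Wellcome Trust',
    ],
    'prerequisite_funders_fundref': [
        'http://dx.doi.org/10.13039/501100000266',
        'http://dx.doi.org/10.13039/501100000266',
        'http://dx.doi.org/10.13039/100004423',
        'http://dx.doi.org/10.13039/100004423',
        'http://dx.doi.org/10.13039/100004440',
    ],
    'prerequisite_funders_ror': [
        'https://ror.org/0439y7842',
        'https://ror.org/05ar5fy68',
        'https://ror.org/01f80g185',
        'https://ror.org/031tfbz57',
        'https://ror.org/029chgv08',
    ],
})

# keep=False marks every member of a duplicate group, not only the repeats.
shared_fundref = funders_sample[
    funders_sample.duplicated(subset='prerequisite_funders_fundref', keep=False)
].sort_values('prerequisite_funders_fundref')

print(shared_fundref[['prerequisite_funders_name', 'prerequisite_funders_fundref']])
```

Whether the shared DOIs are errors in the upstream data or intentional is not something this check can decide; it only surfaces the rows for manual review.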
-
-
-
-
-```python
-# export excel
-funders_dedup.to_excel('sample/funders.xlsx', index=False)
-```
-
-
-```python
-# export as TSV (tab-separated, to match the .tsv extension)
-funders_dedup.to_csv('sample/funders.tsv', sep='\t', index=False)
-```
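
In the funder listings, `prerequisite_funders_sherpa_id` is displayed as a float (`9.0`, `695.0`), which is how pandas stores an integer column that contained missing values at some point. A small sketch of casting such a column to the nullable `Int64` dtype before export follows. The sample values, including the missing one, are illustrative, and whether the notebook needs this step is an assumption.

```python
import pandas as pd

# Illustrative values; the None stands for a funder without a Sherpa id.
sample = pd.DataFrame({
    'name': ['National Institutes of Health (NIH)', 'Wellcome Trust', 'Hypothetical funder'],
    'sherpa_id': [9.0, 695.0, None],
})

# 'Int64' (capital I) is the nullable integer dtype: whole-number floats
# become integers and NaN becomes <NA> instead of forcing a float column.
clean = sample.assign(sherpa_id=sample['sherpa_id'].astype('Int64'))
```

Exported to JSON or TSV, the ids then appear as `9` and `695` rather than `9.0` and `695.0`.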
-
-
-```python
-# build the DataFrame
-organization_funders = funders_dedup
-organization_funders = organization_funders.rename(columns = {'prerequisite_funders_name' : 'name',
- 'prerequisite_funders_fundref' : 'fundref',
- 'prerequisite_funders_ror' : 'ror',
- 'prerequisite_funders_country' : 'iso_code',
- 'prerequisite_funders_url' : 'website',
- 'prerequisite_funders_sherpa_id' : 'sherpa_id'
- })
-organization_funders
-```
-
-
-
-
-
-
-
-
-
-
- name
- fundref
- ror
- iso_code
- website
- sherpa_id
-
-
-
-
- 16
- National Institutes of Health (NIH)
- http://dx.doi.org/10.13039/100000002
- https://ror.org/01cwqze88
- us
- http://www.nih.gov/
- 9.0
-
-
- 58
- Wellcome Trust
- http://dx.doi.org/10.13039/100004440
- https://ror.org/029chgv08
- gb
- http://www.wellcome.ac.uk/
- 695.0
-
-
- 59
- British Heart Foundation (BHF)
- http://dx.doi.org/10.13039/501100000274
- https://ror.org/02wdwnk04
- gb
- http://www.bhf.org.uk/
- 18.0
-
-
- 60
- Versus Arthritis
- http://dx.doi.org/10.13039/501100000341
- https://ror.org/02jkpm469
- gb
- https://www.versusarthritis.org/
- 14.0
-
-
- 61
- Biotechnology and Biological Sciences Research...
- http://dx.doi.org/10.13039/501100000268
- https://ror.org/00cwqg982
- gb
- http://www.bbsrc.ac.uk/home/home.aspx
- 709.0
-
-
- 62
- Blood Cancer UK
- http://dx.doi.org/10.13039/501100007903
- https://ror.org/0055acf80
- gb
- https://bloodcancer.org.uk/
- 925.0
-
-
- 63
- Bill & Melinda Gates Foundation
- http://dx.doi.org/10.13039/100000865
- https://ror.org/0456r8d26
- us
- http://www.gatesfoundation.org/
- 961.0
-
-
- 64
- Cancer Research UK
- http://dx.doi.org/10.13039/501100000289
- https://ror.org/054225q67
- gb
- http://www.cancerresearchuk.org/
- 19.0
-
-
- 65
- Chief Scientist Office, Scottish Executive (CSO)
- http://dx.doi.org/10.13039/501100000589
- https://ror.org/01613vh25
- gb
- http://www.cso.scot.nhs.uk/
- 16.0
-
-
- 66
- Department of Health (DH)
- http://dx.doi.org/10.13039/501100000272
- https://ror.org/0187kwz08
- gb
- http://www.dh.gov.uk/en/index.htm
- 943.0
-
-
- 67
- Dunhill Medical Trust (DMT)
- http://dx.doi.org/10.13039/501100000377
- https://ror.org/05ayqqv15
- gb
- https://dunhillmedical.org.uk/
- 410.0
-
-
- 68
- European Research Council (ERC)
- http://dx.doi.org/10.13039/501100000781
- https://ror.org/0472cxd90
- be
- http://erc.europa.eu/
- 31.0
-
-
- 69
- Medical Research Council (MRC)
- http://dx.doi.org/10.13039/501100000265
- https://ror.org/03x94j517
- gb
- http://www.mrc.ac.uk/index.htm
- 705.0
-
-
- 70
- Motor Neuron Disease Association (MND Associat...
- http://dx.doi.org/10.13039/501100000406
- https://ror.org/02gq0fg61
- gb
- http://www.mndassociation.org/
- 562.0
-
-
- 71
- Parkinson's UK
- http://dx.doi.org/10.13039/501100000304
- https://ror.org/02417p338
- gb
- http://www.parkinsons.org.uk/
- 411.0
-
-
- 72
- Telethon Foundation
- http://dx.doi.org/10.13039/501100002426
- https://ror.org/04xraxn18
- it
- https://www.telethon.it/en/
- 325.0
-
-
- 99
- Howard Hughes Medical Institute (HHMI)
- http://dx.doi.org/10.13039/100000011
- https://ror.org/006w34k90
- us
- http://www.hhmi.org/
- 24.0
-
-
- 149
- Arts and Humanities Research Council (AHRC)
- http://dx.doi.org/10.13039/501100000267
- https://ror.org/0505m1554
- gb
- http://www.ahrc.ac.uk/Pages/Home.aspx
- 698.0
-
-
- 150
- Austrian Science Fund (FWF)
- http://dx.doi.org/10.13039/501100002428
- https://ror.org/013tf3c58
- at
- http://www.fwf.ac.at/en/
- 13.0
-
-
- 153
- Breast Cancer Now
- http://dx.doi.org/10.13039/501100007913
- https://ror.org/02qa92s63
- gb
- http://breastcancernow.org/
- 1065.0
-
-
- 156
- Engineering and Physical Sciences Research Cou...
- http://dx.doi.org/10.13039/501100000266
- https://ror.org/0439y7842
- gb
- http://www.epsrc.ac.uk/Pages/default.aspx
- 722.0
-
-
- 159
- Natural Environment Research Council (NERC)
- http://dx.doi.org/10.13039/501100000270
- https://ror.org/02b5d8509
- gb
- https://nerc.ukri.org/
- 726.0
-
-
- 162
- Science and Technology Facilities Council (STFC)
- http://dx.doi.org/10.13039/501100000271
- https://ror.org/057g20z61
- gb
- http://www.stfc.ac.uk/
- 716.0
-
-
- 164
- Vetenskapsrådet
- http://dx.doi.org/10.13039/501100004359
- https://ror.org/03zttf063
- se
- http://www.vr.se/
- 302.0
-
-
- 165
- World Health Organization (WHO)
- http://dx.doi.org/10.13039/100004423
- https://ror.org/01f80g185
- ch
- http://www.who.int/
- 903.0
-
-
- 166
- World Bank
- http://dx.doi.org/10.13039/100004421
- https://ror.org/00ae7jd04
- us
- http://www.worldbank.org/
- 525.0
-
-
- 167
- Yorkshire Cancer Research
- http://dx.doi.org/10.13039/501100002653
- https://ror.org/02cddnn97
- gb
- http://www.yorkshirecancerresearch.org.uk/
- 428.0
-
-
- 169
- Economic and Social Research Council (ESRC)
- http://dx.doi.org/10.13039/501100000269
- https://ror.org/03n0ht308
- gb
- http://www.esrc.ac.uk/
- 717.0
-
-
- 418
- Higher Education Funding Council for England (...
- http://dx.doi.org/10.13039/501100000384
- https://ror.org/02wxr8x18
- gb
- http://www.hefce.ac.uk/
- 877.0
-
-
- 419
- Higher Education Funding Council for Wales (HE...
- http://dx.doi.org/10.13039/501100000383
- https://ror.org/056y81r79
- gb
- http://www.hefcw.ac.uk/home/home.aspx
- 881.0
-
-
- 420
- Scottish Funding Council (SFC)
- http://dx.doi.org/10.13039/501100000360
- https://ror.org/056bwcz71
- gb
- http://www.sfc.ac.uk/
- 887.0
-
-
- 421
- Department for the Economy, Northern Ireland
- http://dx.doi.org/10.13039/100008303
- https://ror.org/05w9mt194
- gb
- https://www.economy-ni.gov.uk/
- 884.0
-
-
- 960
- Academy of Finland
- http://dx.doi.org/10.13039/501100002341
- https://ror.org/05k73zm37
- fi
- https://www.aka.fi/en/
- 1248.0
-
-
- 961
- Agence Nationale de la Recherche (ANR)
- http://dx.doi.org/10.13039/501100001665
- https://ror.org/00rbzpz17
- fr
- http://www.agence-nationale-recherche.fr/
- 30.0
-
-
- 963
- Fundação para a Ciência e a Tecnologia
- http://dx.doi.org/10.13039/501100001871
- https://ror.org/00snfqn58
- pt
- http://www.fct.pt/
- 1109.0
-
-
- 964
- Formas
- http://dx.doi.org/10.13039/501100001862
- https://ror.org/03pjs1y45
- se
- http://www.formas.se/
- 452.0
-
-
- 967
- Nederlandse Organisatie voor Wetenschappelijk ...
- http://dx.doi.org/10.13039/501100003246
- https://ror.org/04jsz6e67
- nl
- http://www.nwo.nl/
- 459.0
-
-
- 968
- Science Foundation Ireland (SFI)
- http://dx.doi.org/10.13039/501100001602
- https://ror.org/0271asj38
- ie
- http://www.sfi.ie/
- 210.0
-
-
- 970
- Research Council of Norway
- http://dx.doi.org/10.13039/501100005416
- https://ror.org/00epmv149
- no
- https://www.forskningsradet.no/en/
- 266.0
-
-
- 971
- Forskningsrådet för hälsa, arbetsliv och välfä...
- http://dx.doi.org/10.13039/501100006636
- https://ror.org/02d290r06
- se
- http://www.forte.se/
- 455.0
-
-
- 978
- Innovate UK
- http://dx.doi.org/10.13039/501100000266
- https://ror.org/05ar5fy68
- gb
- https://www.gov.uk/government/organisations/in...
- 1267.0
-
-
- 1048
- Diabetes UK
- http://dx.doi.org/10.13039/501100000361
- https://ror.org/050rgn017
- gb
- http://www.diabetes.org.uk/
- 492.0
-
-
- 1052
- Marie Curie
- http://dx.doi.org/10.13039/501100000654
- https://ror.org/02aqv1x10
- gb
- http://www.mariecurie.org.uk/
- 595.0
-
-
- 1055
- Action on Hearing Loss
- http://dx.doi.org/10.13039/501100000703
- https://ror.org/05w6qh410
- gb
- http://www.actiononhearingloss.org.uk/
- 412.0
-
-
- 1056
- Alzheimer's Society
- http://dx.doi.org/10.13039/501100000320
- https://ror.org/0472gwq90
- gb
- http://alzheimers.org.uk/
- 443.0
-
-
- 1063
- Multiple Sclerosis Society
- http://dx.doi.org/10.13039/501100000381
- https://ror.org/043fwdk81
- gb
- http://www.mssociety.org.uk/
- 745.0
-
-
- 1064
- Myrovlytis Trust
- http://dx.doi.org/10.13039/501100001291
- https://ror.org/05bj02613
- gb
- http://www.myrovlytistrust.org/
- 858.0
-
-
- 1065
- National Centre for the Replacement, Refinemen...
- http://dx.doi.org/10.13039/501100000849
- https://ror.org/02w0kg036
- gb
- http://www.nc3rs.org.uk/
- 859.0
-
-
- 1072
- Worldwide Cancer Reseach
- http://dx.doi.org/10.13039/100004423
- https://ror.org/031tfbz57
- gb
- http://www.worldwidecancerresearch.org/
- 425.0
-
-
- 2219
- Canadian Institutes of Health Research (CIHR)
- http://dx.doi.org/10.13039/501100000024
- https://ror.org/01gavpb45
- ca
- http://www.cihr-irsc.gc.ca/
- 28.0
-
-
- 5490
- US Department of Energy (DOE)
- http://dx.doi.org/10.13039/100000015
- https://ror.org/01bj3aw27
- us
- http://energy.gov/
- 962.0
-
-
- 5491
- Agency for Healthcare Research and Quality (AHRQ)
- http://dx.doi.org/10.13039/100000133
- https://ror.org/03jmfdf59
- us
- http://www.ahrq.gov/index.html
- 981.0
-
-
- 5492
- Institute of Education Sciences (IES)
- http://dx.doi.org/10.13039/100005246
- https://ror.org/04et59085
- us
- http://ies.ed.gov/
- 291.0
-
-
- 5493
- National Aeronautics and Space Administration ...
- http://dx.doi.org/10.13039/100000104
- https://ror.org/027ka1x80
- us
- http://science.nasa.gov/
- 986.0
-
-
- 5494
- National Science Foundation (NSF)
- http://dx.doi.org/10.13039/100000001
- https://ror.org/021nxhr62
- us
- http://www.nsf.gov/
- 354.0
-
-
- 7232
- Academy of Medical Science
- http://dx.doi.org/10.13039/501100000691
- https://ror.org/00c489v88
- gb
- https://acmedsci.ac.uk/
- 1125.0
-
-
- 7239
- Prostate Cancer UK
- http://dx.doi.org/10.13039/501100000771
- https://ror.org/04dkv6329
- gb
- http://prostatecanceruk.org/
- 742.0
-
-
- 7240
- Schweizerischer Nationalfonds zur Förderung de...
- http://dx.doi.org/10.13039/501100001711
- https://ror.org/00yjd3n13
- ch
- http://www.snf.ch/de/Seiten/default.aspx
- 25.0
-
-
-
-
-
-
-
-
-```python
-# link to countries: load the country lookup table (name, ISO code, id)
-country = pd.read_csv('sample/country.tsv', encoding='utf-8', header=0, sep='\t')
-country
-```
-
-
-
-
-
-
-
-
-
-
-| | name | iso_code | id |
-|---|---|---|---|
-| 0 | Afghanistan | AF | 1 |
-| 1 | Albania | AL | 2 |
-| 2 | Algeria | DZ | 3 |
-| 3 | American Samoa | AS | 4 |
-| 4 | Andorra | AD | 5 |
-| ... | ... | ... | ... |
-| 246 | Zambia | ZM | 247 |
-| 247 | Zimbabwe | ZW | 248 |
-| 248 | Åland Islands | AX | 249 |
-| 249 | International Agency | OI | 250 |
-| 250 | UNKNOWN | __ | 999999 |
-
-
-
-
251 rows × 3 columns
-
-
-
-
-
-```python
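-# merge with the country table
-# upper-case the ISO codes so they match country['iso_code']
-organization_funders['iso_code'] = organization_funders['iso_code'].str.upper()
-# flag every row in this table as a funder
-organization_funders['is_funder'] = 1
-# left join: keep every funder and attach the matching country id
-organization_funders = pd.merge(organization_funders, country[['iso_code', 'id']], how='left', on='iso_code')
-organization_funders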
-```
-
-
-
-
-
-
-
-
-
-
-| | name | fundref | ror | iso_code | website | sherpa_id | is_funder | id |
-|---|---|---|---|---|---|---|---|---|
-| 0 | National Institutes of Health (NIH) | http://dx.doi.org/10.13039/100000002 | https://ror.org/01cwqze88 | US | http://www.nih.gov/ | 9.0 | 1 | 236 |
-| 1 | Wellcome Trust | http://dx.doi.org/10.13039/100004440 | https://ror.org/029chgv08 | GB | http://www.wellcome.ac.uk/ | 695.0 | 1 | 234 |
-| 2 | British Heart Foundation (BHF) | http://dx.doi.org/10.13039/501100000274 | https://ror.org/02wdwnk04 | GB | http://www.bhf.org.uk/ | 18.0 | 1 | 234 |
-| 3 | Versus Arthritis | http://dx.doi.org/10.13039/501100000341 | https://ror.org/02jkpm469 | GB | https://www.versusarthritis.org/ | 14.0 | 1 | 234 |
-| 4 | Biotechnology and Biological Sciences Research... | http://dx.doi.org/10.13039/501100000268 | https://ror.org/00cwqg982 | GB | http://www.bbsrc.ac.uk/home/home.aspx | 709.0 | 1 | 234 |
-| 5 | Blood Cancer UK | http://dx.doi.org/10.13039/501100007903 | https://ror.org/0055acf80 | GB | https://bloodcancer.org.uk/ | 925.0 | 1 | 234 |
-| 6 | Bill & Melinda Gates Foundation | http://dx.doi.org/10.13039/100000865 | https://ror.org/0456r8d26 | US | http://www.gatesfoundation.org/ | 961.0 | 1 | 236 |
-| 7 | Cancer Research UK | http://dx.doi.org/10.13039/501100000289 | https://ror.org/054225q67 | GB | http://www.cancerresearchuk.org/ | 19.0 | 1 | 234 |
-| 8 | Chief Scientist Office, Scottish Executive (CSO) | http://dx.doi.org/10.13039/501100000589 | https://ror.org/01613vh25 | GB | http://www.cso.scot.nhs.uk/ | 16.0 | 1 | 234 |
-| 9 | Department of Health (DH) | http://dx.doi.org/10.13039/501100000272 | https://ror.org/0187kwz08 | GB | http://www.dh.gov.uk/en/index.htm | 943.0 | 1 | 234 |
-| 10 | Dunhill Medical Trust (DMT) | http://dx.doi.org/10.13039/501100000377 | https://ror.org/05ayqqv15 | GB | https://dunhillmedical.org.uk/ | 410.0 | 1 | 234 |
-| 11 | European Research Council (ERC) | http://dx.doi.org/10.13039/501100000781 | https://ror.org/0472cxd90 | BE | http://erc.europa.eu/ | 31.0 | 1 | 21 |
-| 12 | Medical Research Council (MRC) | http://dx.doi.org/10.13039/501100000265 | https://ror.org/03x94j517 | GB | http://www.mrc.ac.uk/index.htm | 705.0 | 1 | 234 |
-| 13 | Motor Neuron Disease Association (MND Associat... | http://dx.doi.org/10.13039/501100000406 | https://ror.org/02gq0fg61 | GB | http://www.mndassociation.org/ | 562.0 | 1 | 234 |
-| 14 | Parkinson's UK | http://dx.doi.org/10.13039/501100000304 | https://ror.org/02417p338 | GB | http://www.parkinsons.org.uk/ | 411.0 | 1 | 234 |
-| 15 | Telethon Foundation | http://dx.doi.org/10.13039/501100002426 | https://ror.org/04xraxn18 | IT | https://www.telethon.it/en/ | 325.0 | 1 | 110 |
-| 16 | Howard Hughes Medical Institute (HHMI) | http://dx.doi.org/10.13039/100000011 | https://ror.org/006w34k90 | US | http://www.hhmi.org/ | 24.0 | 1 | 236 |
-| 17 | Arts and Humanities Research Council (AHRC) | http://dx.doi.org/10.13039/501100000267 | https://ror.org/0505m1554 | GB | http://www.ahrc.ac.uk/Pages/Home.aspx | 698.0 | 1 | 234 |
-| 18 | Austrian Science Fund (FWF) | http://dx.doi.org/10.13039/501100002428 | https://ror.org/013tf3c58 | AT | http://www.fwf.ac.at/en/ | 13.0 | 1 | 14 |
-| 19 | Breast Cancer Now | http://dx.doi.org/10.13039/501100007913 | https://ror.org/02qa92s63 | GB | http://breastcancernow.org/ | 1065.0 | 1 | 234 |
-| 20 | Engineering and Physical Sciences Research Cou... | http://dx.doi.org/10.13039/501100000266 | https://ror.org/0439y7842 | GB | http://www.epsrc.ac.uk/Pages/default.aspx | 722.0 | 1 | 234 |
-| 21 | Natural Environment Research Council (NERC) | http://dx.doi.org/10.13039/501100000270 | https://ror.org/02b5d8509 | GB | https://nerc.ukri.org/ | 726.0 | 1 | 234 |
-| 22 | Science and Technology Facilities Council (STFC) | http://dx.doi.org/10.13039/501100000271 | https://ror.org/057g20z61 | GB | http://www.stfc.ac.uk/ | 716.0 | 1 | 234 |
-| 23 | Vetenskapsrådet | http://dx.doi.org/10.13039/501100004359 | https://ror.org/03zttf063 | SE | http://www.vr.se/ | 302.0 | 1 | 214 |
-| 24 | World Health Organization (WHO) | http://dx.doi.org/10.13039/100004423 | https://ror.org/01f80g185 | CH | http://www.who.int/ | 903.0 | 1 | 215 |
-| 25 | World Bank | http://dx.doi.org/10.13039/100004421 | https://ror.org/00ae7jd04 | US | http://www.worldbank.org/ | 525.0 | 1 | 236 |
-| 26 | Yorkshire Cancer Research | http://dx.doi.org/10.13039/501100002653 | https://ror.org/02cddnn97 | GB | http://www.yorkshirecancerresearch.org.uk/ | 428.0 | 1 | 234 |
-| 27 | Economic and Social Research Council (ESRC) | http://dx.doi.org/10.13039/501100000269 | https://ror.org/03n0ht308 | GB | http://www.esrc.ac.uk/ | 717.0 | 1 | 234 |
-| 28 | Higher Education Funding Council for England (... | http://dx.doi.org/10.13039/501100000384 | https://ror.org/02wxr8x18 | GB | http://www.hefce.ac.uk/ | 877.0 | 1 | 234 |
-| 29 | Higher Education Funding Council for Wales (HE... | http://dx.doi.org/10.13039/501100000383 | https://ror.org/056y81r79 | GB | http://www.hefcw.ac.uk/home/home.aspx | 881.0 | 1 | 234 |
-| 30 | Scottish Funding Council (SFC) | http://dx.doi.org/10.13039/501100000360 | https://ror.org/056bwcz71 | GB | http://www.sfc.ac.uk/ | 887.0 | 1 | 234 |
-| 31 | Department for the Economy, Northern Ireland | http://dx.doi.org/10.13039/100008303 | https://ror.org/05w9mt194 | GB | https://www.economy-ni.gov.uk/ | 884.0 | 1 | 234 |
-| 32 | Academy of Finland | http://dx.doi.org/10.13039/501100002341 | https://ror.org/05k73zm37 | FI | https://www.aka.fi/en/ | 1248.0 | 1 | 75 |
-| 33 | Agence Nationale de la Recherche (ANR) | http://dx.doi.org/10.13039/501100001665 | https://ror.org/00rbzpz17 | FR | http://www.agence-nationale-recherche.fr/ | 30.0 | 1 | 76 |
-| 34 | Fundação para a Ciência e a Tecnologia | http://dx.doi.org/10.13039/501100001871 | https://ror.org/00snfqn58 | PT | http://www.fct.pt/ | 1109.0 | 1 | 178 |
-| 35 | Formas | http://dx.doi.org/10.13039/501100001862 | https://ror.org/03pjs1y45 | SE | http://www.formas.se/ | 452.0 | 1 | 214 |
-| 36 | Nederlandse Organisatie voor Wetenschappelijk ... | http://dx.doi.org/10.13039/501100003246 | https://ror.org/04jsz6e67 | NL | http://www.nwo.nl/ | 459.0 | 1 | 156 |
-| 37 | Science Foundation Ireland (SFI) | http://dx.doi.org/10.13039/501100001602 | https://ror.org/0271asj38 | IE | http://www.sfi.ie/ | 210.0 | 1 | 107 |
-| 38 | Research Council of Norway | http://dx.doi.org/10.13039/501100005416 | https://ror.org/00epmv149 | NO | https://www.forskningsradet.no/en/ | 266.0 | 1 | 166 |
-| 39 | Forskningsrådet för hälsa, arbetsliv och välfä... | http://dx.doi.org/10.13039/501100006636 | https://ror.org/02d290r06 | SE | http://www.forte.se/ | 455.0 | 1 | 214 |
-| 40 | Innovate UK | http://dx.doi.org/10.13039/501100000266 | https://ror.org/05ar5fy68 | GB | https://www.gov.uk/government/organisations/in... | 1267.0 | 1 | 234 |
-| 41 | Diabetes UK | http://dx.doi.org/10.13039/501100000361 | https://ror.org/050rgn017 | GB | http://www.diabetes.org.uk/ | 492.0 | 1 | 234 |
-| 42 | Marie Curie | http://dx.doi.org/10.13039/501100000654 | https://ror.org/02aqv1x10 | GB | http://www.mariecurie.org.uk/ | 595.0 | 1 | 234 |
-| 43 | Action on Hearing Loss | http://dx.doi.org/10.13039/501100000703 | https://ror.org/05w6qh410 | GB | http://www.actiononhearingloss.org.uk/ | 412.0 | 1 | 234 |
-| 44 | Alzheimer's Society | http://dx.doi.org/10.13039/501100000320 | https://ror.org/0472gwq90 | GB | http://alzheimers.org.uk/ | 443.0 | 1 | 234 |
-| 45 | Multiple Sclerosis Society | http://dx.doi.org/10.13039/501100000381 | https://ror.org/043fwdk81 | GB | http://www.mssociety.org.uk/ | 745.0 | 1 | 234 |
-| 46 | Myrovlytis Trust | http://dx.doi.org/10.13039/501100001291 | https://ror.org/05bj02613 | GB | http://www.myrovlytistrust.org/ | 858.0 | 1 | 234 |
-| 47 | National Centre for the Replacement, Refinemen... | http://dx.doi.org/10.13039/501100000849 | https://ror.org/02w0kg036 | GB | http://www.nc3rs.org.uk/ | 859.0 | 1 | 234 |
-| 48 | Worldwide Cancer Reseach | http://dx.doi.org/10.13039/100004423 | https://ror.org/031tfbz57 | GB | http://www.worldwidecancerresearch.org/ | 425.0 | 1 | 234 |
-| 49 | Canadian Institutes of Health Research (CIHR) | http://dx.doi.org/10.13039/501100000024 | https://ror.org/01gavpb45 | CA | http://www.cihr-irsc.gc.ca/ | 28.0 | 1 | 40 |
-| 50 | US Department of Energy (DOE) | http://dx.doi.org/10.13039/100000015 | https://ror.org/01bj3aw27 | US | http://energy.gov/ | 962.0 | 1 | 236 |
-| 51 | Agency for Healthcare Research and Quality (AHRQ) | http://dx.doi.org/10.13039/100000133 | https://ror.org/03jmfdf59 | US | http://www.ahrq.gov/index.html | 981.0 | 1 | 236 |
-| 52 | Institute of Education Sciences (IES) | http://dx.doi.org/10.13039/100005246 | https://ror.org/04et59085 | US | http://ies.ed.gov/ | 291.0 | 1 | 236 |
-| 53 | National Aeronautics and Space Administration ... | http://dx.doi.org/10.13039/100000104 | https://ror.org/027ka1x80 | US | http://science.nasa.gov/ | 986.0 | 1 | 236 |
-| 54 | National Science Foundation (NSF) | http://dx.doi.org/10.13039/100000001 | https://ror.org/021nxhr62 | US | http://www.nsf.gov/ | 354.0 | 1 | 236 |
-| 55 | Academy of Medical Science | http://dx.doi.org/10.13039/501100000691 | https://ror.org/00c489v88 | GB | https://acmedsci.ac.uk/ | 1125.0 | 1 | 234 |
-| 56 | Prostate Cancer UK | http://dx.doi.org/10.13039/501100000771 | https://ror.org/04dkv6329 | GB | http://prostatecanceruk.org/ | 742.0 | 1 | 234 |
-| 57 | Schweizerischer Nationalfonds zur Förderung de... | http://dx.doi.org/10.13039/501100001711 | https://ror.org/00yjd3n13 | CH | http://www.snf.ch/de/Seiten/default.aspx | 25.0 | 1 | 215 |
-
-
-
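A left merge keeps every funder even when its ISO code has no match in the country table, in which case the country id is simply missing. The sketch below shows one way such rows could be detected, using the `indicator` and `validate` options of `pd.merge`. The two small frames and the "Example Funder" row with code `xx` are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# Hypothetical miniature versions of the funder and country tables.
funders = pd.DataFrame({
    "name": ["Wellcome Trust", "Austrian Science Fund (FWF)", "Example Funder"],
    "iso_code": ["gb", "at", "xx"],  # "xx" is deliberately not a valid code
})
country = pd.DataFrame({
    "name": ["Austria", "United Kingdom"],
    "iso_code": ["AT", "GB"],
    "id": [14, 234],
})

# Same normalization as in the notebook cell: codes must be upper case to match.
funders["iso_code"] = funders["iso_code"].str.upper()

merged = pd.merge(
    funders,
    country[["iso_code", "id"]],
    how="left",
    on="iso_code",
    validate="many_to_one",  # raises if an ISO code appears twice in `country`
    indicator=True,          # adds a `_merge` column describing each row's origin
)

# Rows marked "left_only" found no country and therefore have a missing id.
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
merged = merged.drop(columns="_merge").rename(columns={"id": "country"})

print(unmatched)
print(merged)
```

Because unmatched rows carry a missing value, the resulting `country` column is stored as floats rather than integers until those rows are fixed or dropped.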
-
-
-
-
-
-```python
-organization_funders = organization_funders.rename(columns = {'id' : 'country'})
-organization_funders
-```
-
-
-
-
-
-
-
-
-
-
-| | name | fundref | ror | iso_code | website | sherpa_id | is_funder | country |
-|---|---|---|---|---|---|---|---|---|
-| 0 | National Institutes of Health (NIH) | http://dx.doi.org/10.13039/100000002 | https://ror.org/01cwqze88 | US | http://www.nih.gov/ | 9.0 | 1 | 236 |
-| 1 | Wellcome Trust | http://dx.doi.org/10.13039/100004440 | https://ror.org/029chgv08 | GB | http://www.wellcome.ac.uk/ | 695.0 | 1 | 234 |
-| 2 | British Heart Foundation (BHF) | http://dx.doi.org/10.13039/501100000274 | https://ror.org/02wdwnk04 | GB | http://www.bhf.org.uk/ | 18.0 | 1 | 234 |
-| 3 | Versus Arthritis | http://dx.doi.org/10.13039/501100000341 | https://ror.org/02jkpm469 | GB | https://www.versusarthritis.org/ | 14.0 | 1 | 234 |
-| 4 | Biotechnology and Biological Sciences Research... | http://dx.doi.org/10.13039/501100000268 | https://ror.org/00cwqg982 | GB | http://www.bbsrc.ac.uk/home/home.aspx | 709.0 | 1 | 234 |
-| 5 | Blood Cancer UK | http://dx.doi.org/10.13039/501100007903 | https://ror.org/0055acf80 | GB | https://bloodcancer.org.uk/ | 925.0 | 1 | 234 |
-| 6 | Bill & Melinda Gates Foundation | http://dx.doi.org/10.13039/100000865 | https://ror.org/0456r8d26 | US | http://www.gatesfoundation.org/ | 961.0 | 1 | 236 |
-| 7 | Cancer Research UK | http://dx.doi.org/10.13039/501100000289 | https://ror.org/054225q67 | GB | http://www.cancerresearchuk.org/ | 19.0 | 1 | 234 |
-| 8 | Chief Scientist Office, Scottish Executive (CSO) | http://dx.doi.org/10.13039/501100000589 | https://ror.org/01613vh25 | GB | http://www.cso.scot.nhs.uk/ | 16.0 | 1 | 234 |
-| 9 | Department of Health (DH) | http://dx.doi.org/10.13039/501100000272 | https://ror.org/0187kwz08 | GB | http://www.dh.gov.uk/en/index.htm | 943.0 | 1 | 234 |
-| 10 | Dunhill Medical Trust (DMT) | http://dx.doi.org/10.13039/501100000377 | https://ror.org/05ayqqv15 | GB | https://dunhillmedical.org.uk/ | 410.0 | 1 | 234 |
-| 11 | European Research Council (ERC) | http://dx.doi.org/10.13039/501100000781 | https://ror.org/0472cxd90 | BE | http://erc.europa.eu/ | 31.0 | 1 | 21 |
-| 12 | Medical Research Council (MRC) | http://dx.doi.org/10.13039/501100000265 | https://ror.org/03x94j517 | GB | http://www.mrc.ac.uk/index.htm | 705.0 | 1 | 234 |
-| 13 | Motor Neuron Disease Association (MND Associat... | http://dx.doi.org/10.13039/501100000406 | https://ror.org/02gq0fg61 | GB | http://www.mndassociation.org/ | 562.0 | 1 | 234 |
-| 14 | Parkinson's UK | http://dx.doi.org/10.13039/501100000304 | https://ror.org/02417p338 | GB | http://www.parkinsons.org.uk/ | 411.0 | 1 | 234 |
-| 15 | Telethon Foundation | http://dx.doi.org/10.13039/501100002426 | https://ror.org/04xraxn18 | IT | https://www.telethon.it/en/ | 325.0 | 1 | 110 |
-| 16 | Howard Hughes Medical Institute (HHMI) | http://dx.doi.org/10.13039/100000011 | https://ror.org/006w34k90 | US | http://www.hhmi.org/ | 24.0 | 1 | 236 |
-| 17 | Arts and Humanities Research Council (AHRC) | http://dx.doi.org/10.13039/501100000267 | https://ror.org/0505m1554 | GB | http://www.ahrc.ac.uk/Pages/Home.aspx | 698.0 | 1 | 234 |
-| 18 | Austrian Science Fund (FWF) | http://dx.doi.org/10.13039/501100002428 | https://ror.org/013tf3c58 | AT | http://www.fwf.ac.at/en/ | 13.0 | 1 | 14 |
-| 19 | Breast Cancer Now | http://dx.doi.org/10.13039/501100007913 | https://ror.org/02qa92s63 | GB | http://breastcancernow.org/ | 1065.0 | 1 | 234 |
-| 20 | Engineering and Physical Sciences Research Cou... | http://dx.doi.org/10.13039/501100000266 | https://ror.org/0439y7842 | GB | http://www.epsrc.ac.uk/Pages/default.aspx | 722.0 | 1 | 234 |
-| 21 | Natural Environment Research Council (NERC) | http://dx.doi.org/10.13039/501100000270 | https://ror.org/02b5d8509 | GB | https://nerc.ukri.org/ | 726.0 | 1 | 234 |
-| 22 | Science and Technology Facilities Council (STFC) | http://dx.doi.org/10.13039/501100000271 | https://ror.org/057g20z61 | GB | http://www.stfc.ac.uk/ | 716.0 | 1 | 234 |
-| 23 | Vetenskapsrådet | http://dx.doi.org/10.13039/501100004359 | https://ror.org/03zttf063 | SE | http://www.vr.se/ | 302.0 | 1 | 214 |
-| 24 | World Health Organization (WHO) | http://dx.doi.org/10.13039/100004423 | https://ror.org/01f80g185 | CH | http://www.who.int/ | 903.0 | 1 | 215 |
-| 25 | World Bank | http://dx.doi.org/10.13039/100004421 | https://ror.org/00ae7jd04 | US | http://www.worldbank.org/ | 525.0 | 1 | 236 |
-| 26 | Yorkshire Cancer Research | http://dx.doi.org/10.13039/501100002653 | https://ror.org/02cddnn97 | GB | http://www.yorkshirecancerresearch.org.uk/ | 428.0 | 1 | 234 |
-| 27 | Economic and Social Research Council (ESRC) | http://dx.doi.org/10.13039/501100000269 | https://ror.org/03n0ht308 | GB | http://www.esrc.ac.uk/ | 717.0 | 1 | 234 |
-| 28 | Higher Education Funding Council for England (... | http://dx.doi.org/10.13039/501100000384 | https://ror.org/02wxr8x18 | GB | http://www.hefce.ac.uk/ | 877.0 | 1 | 234 |
-| 29 | Higher Education Funding Council for Wales (HE... | http://dx.doi.org/10.13039/501100000383 | https://ror.org/056y81r79 | GB | http://www.hefcw.ac.uk/home/home.aspx | 881.0 | 1 | 234 |
-| 30 | Scottish Funding Council (SFC) | http://dx.doi.org/10.13039/501100000360 | https://ror.org/056bwcz71 | GB | http://www.sfc.ac.uk/ | 887.0 | 1 | 234 |
-| 31 | Department for the Economy, Northern Ireland | http://dx.doi.org/10.13039/100008303 | https://ror.org/05w9mt194 | GB | https://www.economy-ni.gov.uk/ | 884.0 | 1 | 234 |
-| 32 | Academy of Finland | http://dx.doi.org/10.13039/501100002341 | https://ror.org/05k73zm37 | FI | https://www.aka.fi/en/ | 1248.0 | 1 | 75 |
-| 33 | Agence Nationale de la Recherche (ANR) | http://dx.doi.org/10.13039/501100001665 | https://ror.org/00rbzpz17 | FR | http://www.agence-nationale-recherche.fr/ | 30.0 | 1 | 76 |
-| 34 | Fundação para a Ciência e a Tecnologia | http://dx.doi.org/10.13039/501100001871 | https://ror.org/00snfqn58 | PT | http://www.fct.pt/ | 1109.0 | 1 | 178 |
-| 35 | Formas | http://dx.doi.org/10.13039/501100001862 | https://ror.org/03pjs1y45 | SE | http://www.formas.se/ | 452.0 | 1 | 214 |
-| 36 | Nederlandse Organisatie voor Wetenschappelijk ... | http://dx.doi.org/10.13039/501100003246 | https://ror.org/04jsz6e67 | NL | http://www.nwo.nl/ | 459.0 | 1 | 156 |
-| 37 | Science Foundation Ireland (SFI) | http://dx.doi.org/10.13039/501100001602 | https://ror.org/0271asj38 | IE | http://www.sfi.ie/ | 210.0 | 1 | 107 |
-| 38 | Research Council of Norway | http://dx.doi.org/10.13039/501100005416 | https://ror.org/00epmv149 | NO | https://www.forskningsradet.no/en/ | 266.0 | 1 | 166 |
-| 39 | Forskningsrådet för hälsa, arbetsliv och välfä... | http://dx.doi.org/10.13039/501100006636 | https://ror.org/02d290r06 | SE | http://www.forte.se/ | 455.0 | 1 | 214 |
-| 40 | Innovate UK | http://dx.doi.org/10.13039/501100000266 | https://ror.org/05ar5fy68 | GB | https://www.gov.uk/government/organisations/in... | 1267.0 | 1 | 234 |
-| 41 | Diabetes UK | http://dx.doi.org/10.13039/501100000361 | https://ror.org/050rgn017 | GB | http://www.diabetes.org.uk/ | 492.0 | 1 | 234 |
-| 42 | Marie Curie | http://dx.doi.org/10.13039/501100000654 | https://ror.org/02aqv1x10 | GB | http://www.mariecurie.org.uk/ | 595.0 | 1 | 234 |
-| 43 | Action on Hearing Loss | http://dx.doi.org/10.13039/501100000703 | https://ror.org/05w6qh410 | GB | http://www.actiononhearingloss.org.uk/ | 412.0 | 1 | 234 |
-| 44 | Alzheimer's Society | http://dx.doi.org/10.13039/501100000320 | https://ror.org/0472gwq90 | GB | http://alzheimers.org.uk/ | 443.0 | 1 | 234 |
-| 45 | Multiple Sclerosis Society | http://dx.doi.org/10.13039/501100000381 | https://ror.org/043fwdk81 | GB | http://www.mssociety.org.uk/ | 745.0 | 1 | 234 |
-| 46 | Myrovlytis Trust | http://dx.doi.org/10.13039/501100001291 | https://ror.org/05bj02613 | GB | http://www.myrovlytistrust.org/ | 858.0 | 1 | 234 |
-| 47 | National Centre for the Replacement, Refinemen... | http://dx.doi.org/10.13039/501100000849 | https://ror.org/02w0kg036 | GB | http://www.nc3rs.org.uk/ | 859.0 | 1 | 234 |
-| 48 | Worldwide Cancer Reseach | http://dx.doi.org/10.13039/100004423 | https://ror.org/031tfbz57 | GB | http://www.worldwidecancerresearch.org/ | 425.0 | 1 | 234 |
-| 49 | Canadian Institutes of Health Research (CIHR) | http://dx.doi.org/10.13039/501100000024 | https://ror.org/01gavpb45 | CA | http://www.cihr-irsc.gc.ca/ | 28.0 | 1 | 40 |
-| 50 | US Department of Energy (DOE) | http://dx.doi.org/10.13039/100000015 | https://ror.org/01bj3aw27 | US | http://energy.gov/ | 962.0 | 1 | 236 |
-| 51 | Agency for Healthcare Research and Quality (AHRQ) | http://dx.doi.org/10.13039/100000133 | https://ror.org/03jmfdf59 | US | http://www.ahrq.gov/index.html | 981.0 | 1 | 236 |
-| 52 | Institute of Education Sciences (IES) | http://dx.doi.org/10.13039/100005246 | https://ror.org/04et59085 | US | http://ies.ed.gov/ | 291.0 | 1 | 236 |
-| 53 | National Aeronautics and Space Administration ... | http://dx.doi.org/10.13039/100000104 | https://ror.org/027ka1x80 | US | http://science.nasa.gov/ | 986.0 | 1 | 236 |
-| 54 | National Science Foundation (NSF) | http://dx.doi.org/10.13039/100000001 | https://ror.org/021nxhr62 | US | http://www.nsf.gov/ | 354.0 | 1 | 236 |
-| 55 | Academy of Medical Science | http://dx.doi.org/10.13039/501100000691 | https://ror.org/00c489v88 | GB | https://acmedsci.ac.uk/ | 1125.0 | 1 | 234 |
-| 56 | Prostate Cancer UK | http://dx.doi.org/10.13039/501100000771 | https://ror.org/04dkv6329 | GB | http://prostatecanceruk.org/ | 742.0 | 1 | 234 |
-| 57 | Schweizerischer Nationalfonds zur Förderung de... | http://dx.doi.org/10.13039/501100001711 | https://ror.org/00yjd3n13 | CH | http://www.snf.ch/de/Seiten/default.aspx | 25.0 | 1 | 215 |
-
-
-
-
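The cell that follows reads the Swiss organizations with `dtype` set to `str` for `fundref` and `orgref` and with `na_filter=False`. A minimal sketch of what those two options change, assuming a made-up two-row TSV (the organization names and the `orgref` value are illustrative only):

```python
import io

import pandas as pd

# Hypothetical TSV: one row with identifiers, one row with empty identifiers.
tsv = "name\tfundref\torgref\nOrg A\t501100003006\t0210910\nOrg B\t\t\n"

# Default parsing: identifiers are inferred as numbers and empty cells become NaN.
default = pd.read_csv(io.StringIO(tsv), sep="\t")

# Parsing as in the notebook: identifiers stay text and empty cells stay "".
as_text = pd.read_csv(
    io.StringIO(tsv),
    sep="\t",
    dtype={"fundref": str, "orgref": str},
    na_filter=False,
)

print(default.dtypes)
print(repr(as_text.loc[0, "orgref"]), repr(as_text.loc[1, "fundref"]))
```

Keeping identifiers as text avoids losing leading zeros and avoids float formatting such as a trailing `.0`, which would otherwise appear as soon as one cell in the column is empty.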
-
-
-
-
-```python
-# add the Swiss organizations
-organization = pd.read_csv('ror/ror_ch_hei_export.tsv', encoding='utf-8', header=0, sep='\t', dtype={'fundref': str, 'orgref': str}, na_filter=False)
-organization
-```
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 0
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 1
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 2
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 3
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 4
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 5
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 6
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 7
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 8
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 9
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 10
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 11
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 12
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 13
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 14
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 15
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 16
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 17
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 18
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 19
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 20
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 21
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 22
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 23
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 24
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
- 25
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 26
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 27
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 28
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 29
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 30
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 31
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 32
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 33
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 34
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
- 35
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 36
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 37
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 38
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 39
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 40
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 41
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 42
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 43
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 44
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 45
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
-
-
-
-
-
-
-```python
-# sort by name
-organization = organization.sort_values(by='name')
-organization
-```
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 28
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 27
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 23
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 26
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 2
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 3
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 5
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 6
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 7
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 11
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 12
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 9
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 13
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 14
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 16
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 17
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 19
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 22
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 45
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
- 20
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 21
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 25
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 37
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 39
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 1
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 4
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 38
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 10
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 0
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 40
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 41
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 30
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 31
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 32
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 42
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 43
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 33
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 44
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 18
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 36
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 29
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 35
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 8
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 15
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 34
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
- 24
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
-
-
-
-
-
-
-```python
-organization = organization.reset_index(drop=True)
-organization
-```
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 0
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 1
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 2
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 3
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 4
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 5
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 6
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 7
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 8
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 9
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 10
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 11
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 12
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 13
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 14
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 15
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 16
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 17
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 18
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
- 19
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 20
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 21
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 22
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 23
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 24
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 25
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 26
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 27
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 28
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 29
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 30
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 31
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 32
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 33
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 34
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 35
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 36
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 37
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 38
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 39
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 40
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 41
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 42
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 43
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 44
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
- 45
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
-
-
-
-
-
-
-```python
-# Put EPFL in position 1 and UNIGE in position 2.
-# Row 32 is UNIGE after the reset_index above; this cell moves it to the front.
-target_row = 32
-# Move target row to first element of list.
-idx = [target_row] + [i for i in range(len(organization)) if i != target_row]
-organization = organization.iloc[idx]
-organization
-```
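
The cell above relies on a hard-coded positional index (`32`), which silently points at a different institution if rows are added or removed. A minimal sketch of the same reordering keyed on a column value is shown below. The helper name `move_rows_to_front` and the toy DataFrame are assumptions made for illustration and are not part of the original notebook.

```python
import pandas as pd


def move_rows_to_front(df, column, values):
    """Return df reordered so that the rows matching `values` come first.

    For each entry in `values`, the first row whose `column` equals that
    entry is moved to the top, in the order given. All other rows keep
    their relative order.
    """
    # Index labels of the rows to promote, in the requested order.
    front = [df.index[df[column] == value][0] for value in values]
    # Every remaining index label, in its existing order.
    rest = [label for label in df.index if label not in front]
    return df.loc[front + rest]


# Toy data standing in for the real `organization` table (hypothetical subset).
organization = pd.DataFrame(
    {
        "name": [
            "Agroscope",
            "University of Geneva (UNIGE)",
            "École Polytechnique Fédérale de Lausanne (EPFL)",
        ],
        "acronym": ["", "UNIGE", "EPFL"],
    }
)

reordered = move_rows_to_front(organization, "acronym", ["EPFL", "UNIGE"])
```

Looking rows up by `acronym` keeps the intended order (EPFL first, UNIGE second) stable even if the sort order or the number of institutions changes. A missing acronym raises an `IndexError` instead of promoting the wrong row.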
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 32
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 0
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 1
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 2
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 3
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 4
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 5
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 6
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 7
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 8
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 9
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 10
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 11
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 12
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 13
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 14
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 15
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 16
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 17
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 18
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
- 19
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 20
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 21
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 22
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 23
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 24
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 25
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 26
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 27
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 28
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 29
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 30
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 31
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 33
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 34
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 35
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 36
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 37
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 38
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 39
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 40
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 41
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 42
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 43
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 44
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
- 45
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
-
-
-
-
-
-
-```python
-organization = organization.reset_index(drop=True)
-organization
-```
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 0
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 1
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 2
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 3
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 4
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 5
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 6
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 7
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 8
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 9
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 10
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 11
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 12
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 13
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 14
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 15
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 16
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 17
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 18
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 19
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
- 20
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 21
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 22
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 23
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 24
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 25
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 26
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 27
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 28
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 29
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 30
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 31
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 32
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 33
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 34
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 35
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 36
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 37
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 38
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 39
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 40
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 41
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 42
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 43
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 44
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
- 45
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
-
-
-
-
-
-
-```python
-# Put EPFL in first position, so that UNIGE follows in second
-target_row = 45
-# Move the target row to the front, keeping the remaining rows in their existing order.
-idx = [target_row] + [i for i in range(len(organization)) if i != target_row]
-organization = organization.iloc[idx]
-organization
-```
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 45
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
- 0
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 1
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 2
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 3
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 4
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 5
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 6
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 7
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 8
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 9
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 10
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 11
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 12
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 13
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 14
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 15
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 16
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 17
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 18
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 19
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
- 20
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 21
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 22
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 23
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 24
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 25
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 26
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 27
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 28
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 29
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 30
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 31
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 32
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 33
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 34
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 35
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 36
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 37
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 38
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 39
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 40
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 41
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 42
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 43
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 44
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
-
-
-
-
-
-
-```python
-organization = organization.reset_index(drop=True)
-organization
-```
-
-
-
-
-
-
-
-
-
-
- ror
- name
- label_en
- label_fr
- label_de
- label_it
- website
- country
- starting_year
- is_funder
- acronym
- aliases
- isni
- fundref
- orgref
- wikidata
- grid
-
-
-
-
- 0
- https://ror.org/02s376052
- École Polytechnique Fédérale de Lausanne (EPFL)
- Swiss Federal Institute of Technology in Lausanne
-
-
-
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- EPFL
-
- 0000000121839049
- 501100001703
- 71968
- Q262760
- grid.5333.6
-
-
- 1
- https://ror.org/01swzsf04
- University of Geneva (UNIGE)
-
- Université de Genève
-
- Università di Ginevra
- https://www.unige.ch/
- 215
- 1559
- 0
- UNIGE
- Schola Genevensis
- 0000 0001 2322 4988
- 501100006389
- 342348
- Q503473
- grid.8591.5
-
-
- 2
- https://ror.org/04d8ztx87
- Agroscope
-
-
-
-
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
-
-
- 0000 0004 4681 910X
-
-
- Q397466
- grid.417771.3
-
-
- 3
- https://ror.org/02bnkt322
- Bern University of Applied Sciences (BFH)
-
- Haute école spécialisée bernoise
- Berner Fachhochschule
-
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- BFH
-
- 0000 0001 0688 6779
- 501100006259
- 4365265
- Q466455
- grid.424060.4
-
-
- 4
- https://ror.org/05a28rw58
- ETH Zurich (ETH Zurich)
-
- École Polytechnique Fédérale de Zurich
- Eidgenössische Technische Hochschule Zürich
- Politecnico federale di Zurigo
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 0000 0001 2156 2780
- 501100003006
- 210910
- Q11942
- grid.5801.c
-
-
- 5
- https://ror.org/01ggx4157
- European Organization for Nuclear Research (CERN)
-
- Organisation européenne pour la recherche nucl...
- Europäische Organisation für Kernforschung
-
- http://home.web.cern.ch/
- 215
- 1954
- 0
- CERN
-
- 0000 0001 2156 142X
-
- 37351
- Q42944
- grid.9132.9
-
-
- 6
- https://ror.org/0210tb741
- Forschungsinstitut für biologischen Landbau (F...
-
-
-
-
- https://www.fibl.org/en/germany/location-de.html
- 215
-
- 0
- FiBL
-
-
-
-
-
- grid.506220.3
-
-
- 7
- https://ror.org/007ygn379
- Graduate Institute of International and Develo...
-
- Institut de Hautes études Internationales et d...
- Hochschulinstitut für internationale Studien u...
-
- http://graduateinstitute.ch/home.html
- 215
- 1927
- 0
- IHEID
- Graduate Institute Geneva
- 0000 0001 2296 9873
-
- 14744053
- Q691686
- grid.424404.2
-
-
- 8
- https://ror.org/015pmkr43
- Haute École Pédagogique BEJUNE (HEP BEJUNE)
-
-
-
-
- http://www.hep-bejune.ch/
- 215
- 2001
- 0
- HEP BEJUNE
-
- 0000 0001 0658 3479
-
-
-
- grid.469449.2
-
-
- 9
- https://ror.org/048gre751
- Haute École Pédagogique Fribourg (HEP-PH FR)
-
-
-
-
- https://www.hepfr.ch/
- 215
- 1990
- 0
- HEP-PH FR
-
- 0000 0001 0266 4909
-
-
-
- grid.469451.b
-
-
- 10
- https://ror.org/01bvm0h13
- Haute École Pédagogique du Canton de Vaud (HEP...
-
-
-
-
- http://www.hepl.ch/cms/accueil.html
- 215
- 2001
- 0
- HEP Vaud
-
- 0000 0004 0613 4050
-
-
-
- grid.466224.0
-
-
- 11
- https://ror.org/049c2kr37
- Kalaidos University of Applied Sciences (Kalai...
-
-
- Kalaidos Fachhochschule
-
- https://www.kalaidos-fh.ch/de-CH
- 215
- 1995
- 0
- Kalaidos UAS
-
- 0000 0004 0453 9054
-
- 6746630
- Q681372
- grid.449532.d
-
-
- 12
- https://ror.org/021f7p178
- Lib4RI - Library for the Research Institutes w...
-
-
-
-
- http://www.lib4ri.ch/
- 215
- 2011
- 0
-
- Lib4RI
- 0000 0004 0624 8541
-
-
- Q1278450
- grid.458352.d
-
-
- 13
- https://ror.org/04nd0xd48
- Lucerne University of Applied Sciences and Arts
-
- Haute École de lucerne
- Hochschule Luzern
-
- https://www.hslu.ch/en/
- 215
- 1997
- 0
-
-
- 0000 0001 2191 8943
-
- 19480920
- Q664028
- grid.425064.1
-
-
- 14
- https://ror.org/00p9jf779
- Medicines for Malaria Venture (MMV)
-
-
-
-
- http://www.mmv.org/
- 215
- 1999
- 0
- MMV
-
- 0000 0004 0432 5267
- 501100004167
-
- Q6806774
- grid.452605.0
-
-
- 15
- https://ror.org/038mj2660
- Ostschweizer Fachhochschule OST
- Eastern Switzerland University of Applied Scie...
-
-
-
- https://www.ost.ch/
- 215
- 1999
- 0
-
-
-
-
-
-
- grid.510272.3
-
-
- 16
- https://ror.org/05jf1ma54
- Pädagogische Hochschule Bern
- Bern University of Teacher Education
-
-
-
- https://www.phbern.ch
- 215
- 2005
- 0
-
- PHBern
- 0000 0000 8585 5665
-
-
-
- grid.454333.6
-
-
- 17
- https://ror.org/02fjgft97
- Pädagogische Hochschule Graubünden (PHGR)
-
-
-
- Alta scuola pedagogica dei Grigioni
- http://www.phgr.ch/
- 215
-
- 0
- PHGR
-
- 0000 0000 9317 283X
-
-
-
- grid.469478.0
-
-
- 18
- https://ror.org/03fs41j10
- Pädagogische Hochschule Schaffhausen (PHSH)
-
-
-
-
- http://www.phsh.ch/
- 215
- 2003
- 0
- PHSH
-
- 0000 0004 0450 7546
-
-
-
- grid.466133.5
-
-
- 19
- https://ror.org/04bf6dq94
- Pädagogische Hochschule Thurgau (PHTG)
-
-
-
-
- http://www.phtg.ch/home/
- 215
- 2003
- 0
- PHTG
-
- 0000 0004 0613 3824
-
-
-
- grid.466322.7
-
-
- 20
- https://ror.org/040gs8e06
- Pädagogische Hochschule Wallis (PH-VS)
-
- Haute École Pédagogique du Valais
-
-
- http://www.hepvs.ch/de
- 215
- 2000
- 0
- PH-VS
-
- 0000 0001 2178 3217
-
-
-
- grid.466216.1
-
-
- 21
- https://ror.org/00rqdn375
- Schwyz University of Teacher Education (PHSZ)
-
-
- Pädagogische Hochschule Schwyz
-
- https://www.phsz.ch/en/
- 215
-
- 0
- PHSZ
- PHZ Schwyz
- 0000 0004 0613 7454
-
-
-
- grid.466169.a
-
-
- 22
- https://ror.org/05m37v666
- St.Gallen University of Teacher Education (PHSG)
-
-
- Pädagogische Hochschule St. Gallen
-
- https://www.phsg.ch/en
- 215
- 2007
- 0
- PHSG
-
- 0000 0001 0271 5139
-
-
- Q1768652
- grid.466208.e
-
-
- 23
- https://ror.org/00zg4za48
- Swiss Federal Institute for Vocational Educati...
-
- Institut Fédéral des Hautes Études en Formatio...
- Eidgenössisches Hochschulinstitut für Berufsbi...
-
- http://www.ehb-schweiz.ch/en/
- 215
- 2007
- 0
- SFIVET
-
- 0000 0001 2285 5681
-
-
- Q1302632
- grid.466173.1
-
-
- 24
- https://ror.org/03mcsbr76
- Swiss Ornithological Institute
-
-
- Schweizerische Vogelwarte
-
- http://www.vogelwarte.ch/de/home/
- 215
- 1924
- 0
-
-
- 0000 0001 1512 3677
-
-
- Q663638
- grid.419767.a
-
-
- 25
- https://ror.org/03c4atk17
- Universita della Svizzera Italiana (USI)
- University of Italian Switzerland
- Université de la suisse italienne
-
- Università della Svizzera italiana
- http://www.usi.ch/en/index.htm
- 215
- 1996
- 0
- USI
-
- 0000 0001 2203 2861
-
- 2290642
- Q689617
- grid.29078.34
-
-
- 26
- https://ror.org/04mq2g308
- University of Applied Sciences and Arts Northw...
-
-
-
-
- http://www.fhnw.ch/homepage
- 215
- 2006
- 0
- FHNW
- Fachhochschule Nordwestschweiz
- 0000 0001 1497 8091
-
-
-
- grid.410380.e
-
-
- 27
- https://ror.org/01xkakk17
- University of Applied Sciences and Arts Wester...
-
- Haute École Spécialisée de Suisse Occidentale
- Fachhochschule Westschweiz
-
- http://www.hes-so.ch/en/homepage-hes-so-1679.html
- 215
- 1998
- 0
- HES-SO
-
- 0000 0001 0943 1999
-
- 10128956
- Q168003
- grid.5681.a
-
-
- 28
- https://ror.org/05ep8g269
- University of Applied Sciences and Arts of Sou...
-
-
-
- Scuola Universitaria Professionale della Svizz...
- http://www.supsi.ch/home_en.html
- 215
- 1997
- 0
- SUPSI
-
- 0000000123252233
-
- 34066841
- Q663984
- grid.16058.3a
-
-
- 29
- https://ror.org/00w9q2c06
- University of Applied Sciences of Special Need...
-
-
- Interkantonale Hochschule für Heilpädagogik
-
- http://www.hfh.ch/en/
- 215
- 1924
- 0
- HfH
- Zurich Training College for Teachers of Specia...
- 0000 0001 0710 6332
-
-
-
- grid.466279.8
-
-
- 30
- https://ror.org/032ymzc07
- University of Applied Sciences of the Grisons
-
-
- Fachhochschule Graubünden
-
- https://www.fhgr.ch/en/
- 215
- 1963
- 0
-
- Hochschule für Technik und Wirtschaft Chur
- 0000 0000 8718 2812
-
-
- Q1622220
- grid.460104.7
-
-
- 31
- https://ror.org/02s6k3f65
- University of Basel
-
- Université de bâle
- Universität Basel
- Università di Basilea
- https://www.unibas.ch/de
- 215
- 1460
- 0
-
-
- 0000 0004 1937 0642
- 100008375
- 427614
- Q372608
- grid.6612.3
-
-
- 32
- https://ror.org/02k7v4d05
- University of Bern (UB)
-
- Université de Berne
- Universität Bern
- Università di Berna
- http://www.unibe.ch/eng/
- 215
- 1834
- 0
- UB
-
- 0000 0001 0726 5157
- 100009068
- 1157515
- Q659080
- grid.5734.5
-
-
- 33
- https://ror.org/022fs9h90
- University of Fribourg
-
- Université de Fribourg
- Universität Freiburg
- Università di Friburgo
- http://www.unifr.ch/home/welcomeE.php
- 215
- 1889
- 0
-
-
- 0000 0004 0478 1713
- 501100005869
- 535267
- Q36188
- grid.8534.a
-
-
- 34
- https://ror.org/019whta54
- University of Lausanne (UNIL)
-
- Université de Lausanne
- Universität Lausanne
- Università di Losanna
- http://www.unil.ch/central/en/home.html
- 215
- 1537
- 0
- UNIL
- Schola Lausannensis
- 0000 0001 2165 4204
- 501100006390
- 79810
- Q658975
- grid.9851.5
-
-
- 35
- https://ror.org/01qjrx392
- University of Liechtenstein
-
-
- Universität Liechtenstein
-
- https://www.uni.li/study/de/
- 128
- 1961
- 0
-
-
- 0000 0001 2227 4668
-
- 10554064
- Q974328
- grid.445905.9
-
-
- 36
- https://ror.org/00kgrkn83
- University of Lucerne (UNILU)
-
- Université de lucerne
- Universität Luzern
- Università di Lucerna
- http://www.unilu.ch/
- 215
- 2000
- 0
- UNILU
-
- 0000 0001 1456 7938
-
- 21004764
- Q673308
- grid.449852.6
-
-
- 37
- https://ror.org/00vasag41
- University of Neuchâtel
-
- Université de neuchâtel
- Universität Neuenburg
-
- http://www2.unine.ch/
- 215
- 1838
- 0
-
-
- 0000 0001 2297 7718
- 501100005353
- 3662101
- Q541548
- grid.10711.36
-
-
- 38
- https://ror.org/0561a3s31
- University of St. Gallen (HSG)
-
- Université de saint-gall
- Universität St. Gallen
- Università di San Gallo
- http://www.es.unisg.ch/en/
- 215
- 1898
- 0
- HSG
-
- 0000 0001 2156 6618
- 100009572
- 751473
- Q673354
- grid.15775.31
-
-
- 39
- https://ror.org/0235ynq74
- University of Teacher Education Lucerne
-
-
- Pädagogische Hochschule Luzern
-
- http://www.phlu.ch/ute-lucerne/
- 215
- 2003
- 0
-
- PH Luzern
- 0000 0001 0348 1637
-
-
-
- grid.465965.d
-
-
- 40
- https://ror.org/05ghhx264
- University of Teacher Education Zug (PH Zug)
-
-
- Pädagogische Hochschule Zug
-
- https://www.zg.ch/behoerden/direktion-fur-bild...
- 215
- 2013
- 0
- PH Zug
-
- 0000 0004 0449 2225
-
-
-
- grid.466274.5
-
-
- 41
- https://ror.org/02crff812
- University of Zurich (UZH)
-
- Université de zurich
- Universität Zürich
- Università di Zurigo
- http://www.uzh.ch/index_en.html
- 215
- 1833
- 0
- UZH
-
- 0000 0004 1937 0650
- 501100006447
- 314803
- Q206702
- grid.7400.3
-
-
- 42
- https://ror.org/05pmsvm27
- Zurich University of Applied Sciences (ZHAW)
-
-
- Zürcher Hochschule für Angewandte Wissenschaften
-
- https://www.zhaw.ch/en/university/
- 215
- 2007
- 0
- ZHAW
-
- 0000000122291644
-
- 30930550
- Q2605554
- grid.19739.35
-
-
- 43
- https://ror.org/02ejkey04
- Zurich University of Applied Sciences in Busin...
-
-
- Hochschule für Wirtschaft Zürich
-
- http://www.fh-hwz.ch/en
- 215
- 1986
- 0
- HWZ
-
- 0000 0001 0008 3713
-
- 30805829
- Q1488771
- grid.449909.9
-
-
- 44
- https://ror.org/01awgk221
- Zurich University of Teacher Education (PHZH)
-
-
- Pädagogische Hochschule Zürich
-
- https://phzh.ch/en/
- 215
- 2002
- 0
- PHZH
- PH Zürich
- 0000 0000 9666 1858
-
-
-
- grid.483054.e
-
-
- 45
- https://ror.org/05r0ap620
- Zurich University of the Arts
-
- Haute École d'Art de Zurich
- Zürcher Hochschule der Künste
-
- https://www.zhdk.ch/
- 215
- 2007
- 0
-
-
-
-
- 39250592
- Q222450
- grid.449912.3
-
-
-
-
-
-
-
-
-```python
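-# add the funders (DataFrame.append was removed in pandas 2.0; pd.concat replaces it)
-# sort=True orders the columns alphabetically, as in the output shown for this cell
-organization = pd.concat([organization, organization_funders], ignore_index=True, sort=True)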
-organization
-```
-
-
-
-
-
-
-
-
-
-
- acronym
- aliases
- country
- fundref
- grid
- is_funder
- isni
- iso_code
- label_de
- label_en
- label_fr
- label_it
- name
- orgref
- ror
- sherpa_id
- starting_year
- website
- wikidata
-
-
-
-
- 0
- EPFL
-
- 215
- 501100001703
- grid.5333.6
- 0
- 0000000121839049
- NaN
-
- Swiss Federal Institute of Technology in Lausanne
-
-
- École Polytechnique Fédérale de Lausanne (EPFL)
- 71968
- https://ror.org/02s376052
- NaN
- 1853
- http://www.epfl.ch/index.en.html
- Q262760
-
-
- 1
- UNIGE
- Schola Genevensis
- 215
- 501100006389
- grid.8591.5
- 0
- 0000 0001 2322 4988
- NaN
-
-
- Université de Genève
- Università di Ginevra
- University of Geneva (UNIGE)
- 342348
- https://ror.org/01swzsf04
- NaN
- 1559
- https://www.unige.ch/
- Q503473
-
-
- 2
-
-
- 215
-
- grid.417771.3
- 0
- 0000 0004 4681 910X
- NaN
-
-
-
-
- Agroscope
-
- https://ror.org/04d8ztx87
- NaN
- 1850
- https://www.agroscope.admin.ch/agroscope/en/ho...
- Q397466
-
-
- 3
- BFH
-
- 215
- 501100006259
- grid.424060.4
- 0
- 0000 0001 0688 6779
- NaN
- Berner Fachhochschule
-
- Haute école spécialisée bernoise
-
- Bern University of Applied Sciences (BFH)
- 4365265
- https://ror.org/02bnkt322
- NaN
- 1997
- http://www.bfh.ch/en/home.html
- Q466455
-
-
- 4
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 215
- 501100003006
- grid.5801.c
- 0
- 0000 0001 2156 2780
- NaN
- Eidgenössische Technische Hochschule Zürich
-
- École Polytechnique Fédérale de Zurich
- Politecnico federale di Zurigo
- ETH Zurich (ETH Zurich)
- 210910
- https://ror.org/05a28rw58
- NaN
- 1855
- https://www.ethz.ch/en.html
- Q11942
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 99
- NaN
- NaN
- 236
- http://dx.doi.org/10.13039/100000104
- NaN
- 1
- NaN
- US
- NaN
- NaN
- NaN
- NaN
- National Aeronautics and Space Administration ...
- NaN
- https://ror.org/027ka1x80
- 986.0
- NaN
- http://science.nasa.gov/
- NaN
-
-
- 100
- NaN
- NaN
- 236
- http://dx.doi.org/10.13039/100000001
- NaN
- 1
- NaN
- US
- NaN
- NaN
- NaN
- NaN
- National Science Foundation (NSF)
- NaN
- https://ror.org/021nxhr62
- 354.0
- NaN
- http://www.nsf.gov/
- NaN
-
-
- 101
- NaN
- NaN
- 234
- http://dx.doi.org/10.13039/501100000691
- NaN
- 1
- NaN
- GB
- NaN
- NaN
- NaN
- NaN
- Academy of Medical Science
- NaN
- https://ror.org/00c489v88
- 1125.0
- NaN
- https://acmedsci.ac.uk/
- NaN
-
-
- 102
- NaN
- NaN
- 234
- http://dx.doi.org/10.13039/501100000771
- NaN
- 1
- NaN
- GB
- NaN
- NaN
- NaN
- NaN
- Prostate Cancer UK
- NaN
- https://ror.org/04dkv6329
- 742.0
- NaN
- http://prostatecanceruk.org/
- NaN
-
-
- 103
- NaN
- NaN
- 215
- http://dx.doi.org/10.13039/501100001711
- NaN
- 1
- NaN
- CH
- NaN
- NaN
- NaN
- NaN
- Schweizerischer Nationalfonds zur Förderung de...
- NaN
- https://ror.org/00yjd3n13
- 25.0
- NaN
- http://www.snf.ch/de/Seiten/default.aspx
- NaN
-
-
-
-
104 rows × 19 columns
-
-
-
-
-
-```python
-# strip the DOI resolver prefix from the fundref id, since that URL only returns JSON
-# current URL: http://data.crossref.org/fundingdata/funder/10.13039/[fundref id]
-# e.g. http://dx.doi.org/10.13039/501100007903
-# redirects to: http://data.crossref.org/fundingdata/funder/10.13039/501100007903
-# URL of the funded publications: https://search.crossref.org/funding?q=[fundref id]&from_ui=yes
-# e.g. https://search.crossref.org/funding?q=501100003006&from_ui=yes
-# regex=False: the prefix is a literal string, so its dots must not act as wildcards
-organization['fundref'] = organization['fundref'].str.replace('http://dx.doi.org/10.13039/', '', regex=False)
-organization
-```
-
-
-
-
-
-
-
-
-
-
- acronym
- aliases
- country
- fundref
- grid
- is_funder
- isni
- iso_code
- label_de
- label_en
- label_fr
- label_it
- name
- orgref
- ror
- sherpa_id
- starting_year
- website
- wikidata
-
-
-
-
- 0
- EPFL
-
- 215
- 501100001703
- grid.5333.6
- 0
- 0000000121839049
- NaN
-
- Swiss Federal Institute of Technology in Lausanne
-
-
- École Polytechnique Fédérale de Lausanne (EPFL)
- 71968
- https://ror.org/02s376052
- NaN
- 1853
- http://www.epfl.ch/index.en.html
- Q262760
-
-
- 1
- UNIGE
- Schola Genevensis
- 215
- 501100006389
- grid.8591.5
- 0
- 0000 0001 2322 4988
- NaN
-
-
- Université de Genève
- Università di Ginevra
- University of Geneva (UNIGE)
- 342348
- https://ror.org/01swzsf04
- NaN
- 1559
- https://www.unige.ch/
- Q503473
-
-
- 2
-
-
- 215
-
- grid.417771.3
- 0
- 0000 0004 4681 910X
- NaN
-
-
-
-
- Agroscope
-
- https://ror.org/04d8ztx87
- NaN
- 1850
- https://www.agroscope.admin.ch/agroscope/en/ho...
- Q397466
-
-
- 3
- BFH
-
- 215
- 501100006259
- grid.424060.4
- 0
- 0000 0001 0688 6779
- NaN
- Berner Fachhochschule
-
- Haute école spécialisée bernoise
-
- Bern University of Applied Sciences (BFH)
- 4365265
- https://ror.org/02bnkt322
- NaN
- 1997
- http://www.bfh.ch/en/home.html
- Q466455
-
-
- 4
- ETH Zurich
- Swiss Federal Institute of Technology in Zuric...
- 215
- 501100003006
- grid.5801.c
- 0
- 0000 0001 2156 2780
- NaN
- Eidgenössische Technische Hochschule Zürich
-
- École Polytechnique Fédérale de Zurich
- Politecnico federale di Zurigo
- ETH Zurich (ETH Zurich)
- 210910
- https://ror.org/05a28rw58
- NaN
- 1855
- https://www.ethz.ch/en.html
- Q11942
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 99
- NaN
- NaN
- 236
- 100000104
- NaN
- 1
- NaN
- US
- NaN
- NaN
- NaN
- NaN
- National Aeronautics and Space Administration ...
- NaN
- https://ror.org/027ka1x80
- 986.0
- NaN
- http://science.nasa.gov/
- NaN
-
-
- 100
- NaN
- NaN
- 236
- 100000001
- NaN
- 1
- NaN
- US
- NaN
- NaN
- NaN
- NaN
- National Science Foundation (NSF)
- NaN
- https://ror.org/021nxhr62
- 354.0
- NaN
- http://www.nsf.gov/
- NaN
-
-
- 101
- NaN
- NaN
- 234
- 501100000691
- NaN
- 1
- NaN
- GB
- NaN
- NaN
- NaN
- NaN
- Academy of Medical Science
- NaN
- https://ror.org/00c489v88
- 1125.0
- NaN
- https://acmedsci.ac.uk/
- NaN
-
-
- 102
- NaN
- NaN
- 234
- 501100000771
- NaN
- 1
- NaN
- GB
- NaN
- NaN
- NaN
- NaN
- Prostate Cancer UK
- NaN
- https://ror.org/04dkv6329
- 742.0
- NaN
- http://prostatecanceruk.org/
- NaN
-
-
- 103
- NaN
- NaN
- 215
- 501100001711
- NaN
- 1
- NaN
- CH
- NaN
- NaN
- NaN
- NaN
- Schweizerischer Nationalfonds zur Förderung de...
- NaN
- https://ror.org/00yjd3n13
- 25.0
- NaN
- http://www.snf.ch/de/Seiten/default.aspx
- NaN
-
-
-
-
104 rows × 19 columns
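The comments in the cell above describe two Crossref URL patterns built around the bare fundref id. As a minimal sketch, the prefix removal and URL construction could be wrapped as follows. The helper names `strip_fundref_prefix` and `crossref_urls` are hypothetical and not part of the notebook; the URL patterns are the ones quoted in the comments.

```python
import pandas as pd

DOI_PREFIX = 'http://dx.doi.org/10.13039/'


def strip_fundref_prefix(series: pd.Series) -> pd.Series:
    """Remove the DOI resolver prefix, leaving the bare fundref id."""
    # regex=False treats the prefix as a literal string (dots are not wildcards)
    return series.str.replace(DOI_PREFIX, '', regex=False)


def crossref_urls(fundref_id: str) -> dict:
    """Build the two URL patterns quoted in the notebook comments (hypothetical helper)."""
    return {
        'funder_data': f'http://data.crossref.org/fundingdata/funder/10.13039/{fundref_id}',
        'funded_publications': f'https://search.crossref.org/funding?q={fundref_id}&from_ui=yes',
    }


# Illustrative values: one prefixed id, one already bare, one missing
sample = pd.Series(['http://dx.doi.org/10.13039/501100001711', '501100003006', None])
cleaned = strip_fundref_prefix(sample)
urls = crossref_urls(cleaned[0])
```

Ids that are already bare pass through unchanged and missing values stay missing, so the step can safely be run more than once.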
-
-
-
-
-
-```python
-# DataFrame for the export
-organization_export = organization[['name', 'website', 'country', 'starting_year', 'is_funder', 'ror', 'fundref']]
-organization_export
-```
-
-
-
-
-
-
-
-
-
-
- | | name | website | country | starting_year | is_funder | ror | fundref |
- |---|---|---|---|---|---|---|---|
- | 0 | École Polytechnique Fédérale de Lausanne (EPFL) | http://www.epfl.ch/index.en.html | 215 | 1853 | 0 | https://ror.org/02s376052 | 501100001703 |
- | 1 | University of Geneva (UNIGE) | https://www.unige.ch/ | 215 | 1559 | 0 | https://ror.org/01swzsf04 | 501100006389 |
- | 2 | Agroscope | https://www.agroscope.admin.ch/agroscope/en/ho... | 215 | 1850 | 0 | https://ror.org/04d8ztx87 | |
- | 3 | Bern University of Applied Sciences (BFH) | http://www.bfh.ch/en/home.html | 215 | 1997 | 0 | https://ror.org/02bnkt322 | 501100006259 |
- | 4 | ETH Zurich (ETH Zurich) | https://www.ethz.ch/en.html | 215 | 1855 | 0 | https://ror.org/05a28rw58 | 501100003006 |
- | ... | ... | ... | ... | ... | ... | ... | ... |
- | 99 | National Aeronautics and Space Administration ... | http://science.nasa.gov/ | 236 | NaN | 1 | https://ror.org/027ka1x80 | 100000104 |
- | 100 | National Science Foundation (NSF) | http://www.nsf.gov/ | 236 | NaN | 1 | https://ror.org/021nxhr62 | 100000001 |
- | 101 | Academy of Medical Science | https://acmedsci.ac.uk/ | 234 | NaN | 1 | https://ror.org/00c489v88 | 501100000691 |
- | 102 | Prostate Cancer UK | http://prostatecanceruk.org/ | 234 | NaN | 1 | https://ror.org/04dkv6329 | 501100000771 |
- | 103 | Schweizerischer Nationalfonds zur Förderung de... | http://www.snf.ch/de/Seiten/default.aspx | 215 | NaN | 1 | https://ror.org/00yjd3n13 | 501100001711 |
-
-
-
-
104 rows × 7 columns
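Selecting a subset of columns and then assigning into the result can trigger pandas' `SettingWithCopyWarning`, because pandas cannot tell whether the subset is a view or a copy of the parent frame. A minimal sketch of the usual remedy, taking an explicit `.copy()` first, is shown below. The `organization` frame here is a small illustrative stand-in, not the notebook's data.

```python
import pandas as pd

# Illustrative stand-in for the organization table
organization = pd.DataFrame({
    'name': ['Agroscope', 'National Science Foundation (NSF)'],
    'starting_year': [1850, None],
    'fundref': [None, '100000001'],
    'grid': ['grid.417771.3', None],
})

# .copy() makes the column subset an independent DataFrame,
# so the assignments below never touch (or warn about) the parent
export = organization[['name', 'starting_year', 'fundref']].copy()
export['starting_year'] = export['starting_year'].fillna(0)
export['fundref'] = export['fundref'].fillna('')
export['id'] = export.index + 1
```

The parent frame is left untouched: it gains no `id` column and keeps its missing values.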
-
-
-
-
-
-```python
-# fill in the empty values
-# fillna returns a new DataFrame, which avoids assigning into a slice (SettingWithCopyWarning)
-organization_export = organization_export.fillna({'starting_year': 0, 'fundref': '', 'ror': ''})
-organization_export
-```
-
-
-
-
-
-
-
-
-
-
-
- | | name | website | country | starting_year | is_funder | ror | fundref |
- |---|---|---|---|---|---|---|---|
- | 0 | École Polytechnique Fédérale de Lausanne (EPFL) | http://www.epfl.ch/index.en.html | 215 | 1853 | 0 | https://ror.org/02s376052 | 501100001703 |
- | 1 | University of Geneva (UNIGE) | https://www.unige.ch/ | 215 | 1559 | 0 | https://ror.org/01swzsf04 | 501100006389 |
- | 2 | Agroscope | https://www.agroscope.admin.ch/agroscope/en/ho... | 215 | 1850 | 0 | https://ror.org/04d8ztx87 | |
- | 3 | Bern University of Applied Sciences (BFH) | http://www.bfh.ch/en/home.html | 215 | 1997 | 0 | https://ror.org/02bnkt322 | 501100006259 |
- | 4 | ETH Zurich (ETH Zurich) | https://www.ethz.ch/en.html | 215 | 1855 | 0 | https://ror.org/05a28rw58 | 501100003006 |
- | ... | ... | ... | ... | ... | ... | ... | ... |
- | 99 | National Aeronautics and Space Administration ... | http://science.nasa.gov/ | 236 | 0 | 1 | https://ror.org/027ka1x80 | 100000104 |
- | 100 | National Science Foundation (NSF) | http://www.nsf.gov/ | 236 | 0 | 1 | https://ror.org/021nxhr62 | 100000001 |
- | 101 | Academy of Medical Science | https://acmedsci.ac.uk/ | 234 | 0 | 1 | https://ror.org/00c489v88 | 501100000691 |
- | 102 | Prostate Cancer UK | http://prostatecanceruk.org/ | 234 | 0 | 1 | https://ror.org/04dkv6329 | 501100000771 |
- | 103 | Schweizerischer Nationalfonds zur Förderung de... | http://www.snf.ch/de/Seiten/default.aspx | 215 | 0 | 1 | https://ror.org/00yjd3n13 | 501100001711 |
-
-
-
-
104 rows × 7 columns
-
-
-
-
-
-```python
-# add the id, derived from the index + 1
-# assign returns a new DataFrame, which avoids assigning into a slice (SettingWithCopyWarning)
-organization_export = organization_export.assign(id=organization_export.index + 1)
-# del terms_export_dedup['index']
-organization_export
-```
-
-
-
-
-
-
-
-
-
-
-
-
- | | name | website | country | starting_year | is_funder | ror | fundref | id |
- |---|---|---|---|---|---|---|---|---|
- | 0 | École Polytechnique Fédérale de Lausanne (EPFL) | http://www.epfl.ch/index.en.html | 215 | 1853 | 0 | https://ror.org/02s376052 | 501100001703 | 1 |
- | 1 | University of Geneva (UNIGE) | https://www.unige.ch/ | 215 | 1559 | 0 | https://ror.org/01swzsf04 | 501100006389 | 2 |
- | 2 | Agroscope | https://www.agroscope.admin.ch/agroscope/en/ho... | 215 | 1850 | 0 | https://ror.org/04d8ztx87 | | 3 |
- | 3 | Bern University of Applied Sciences (BFH) | http://www.bfh.ch/en/home.html | 215 | 1997 | 0 | https://ror.org/02bnkt322 | 501100006259 | 4 |
- | 4 | ETH Zurich (ETH Zurich) | https://www.ethz.ch/en.html | 215 | 1855 | 0 | https://ror.org/05a28rw58 | 501100003006 | 5 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... |
- | 99 | National Aeronautics and Space Administration ... | http://science.nasa.gov/ | 236 | 0 | 1 | https://ror.org/027ka1x80 | 100000104 | 100 |
- | 100 | National Science Foundation (NSF) | http://www.nsf.gov/ | 236 | 0 | 1 | https://ror.org/021nxhr62 | 100000001 | 101 |
- | 101 | Academy of Medical Science | https://acmedsci.ac.uk/ | 234 | 0 | 1 | https://ror.org/00c489v88 | 501100000691 | 102 |
- | 102 | Prostate Cancer UK | http://prostatecanceruk.org/ | 234 | 0 | 1 | https://ror.org/04dkv6329 | 501100000771 | 103 |
- | 103 | Schweizerischer Nationalfonds zur Förderung de... | http://www.snf.ch/de/Seiten/default.aspx | 215 | 0 | 1 | https://ror.org/00yjd3n13 | 501100001711 | 104 |
-
-
-
-
104 rows × 8 columns
-
-
-
-
-
-```python
-# export the table as pretty-printed UTF-8 JSON
-result = organization_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/organization.json', 'w', encoding='utf-8') as file:
-    json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export excel
-organization_export.to_excel('sample/organization.xlsx', index=False)
-```
-
-
-```python
-# export csv (note: the file is comma-separated despite its .tsv extension)
-organization_export.to_csv('sample/organization.tsv', index=False)
-```
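The JSON export above goes through `to_json`, `json.loads` and `json.dump` to obtain an indented, non-escaped UTF-8 file. As a hedged sketch, a round-trip check can confirm that the written file reloads to the same records. The `demo` frame, the `export_json` helper and the temporary path are illustrative assumptions, not part of the notebook.

```python
import json
import os
import tempfile

import pandas as pd

# Illustrative stand-in for organization_export
demo = pd.DataFrame({
    'name': ['École Polytechnique Fédérale de Lausanne (EPFL)', 'Agroscope'],
    'starting_year': [1853, 1850],
    'fundref': ['501100001703', ''],
    'id': [1, 2],
})


def export_json(df: pd.DataFrame, path: str) -> None:
    """Write the rows as indented UTF-8 JSON records (hypothetical helper)."""
    parsed = json.loads(df.to_json(orient='records', force_ascii=False))
    with open(path, 'w', encoding='utf-8') as file:
        json.dump(parsed, file, indent=2, ensure_ascii=False)


with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, 'organization.json')
    export_json(demo, json_path)
    with open(json_path, encoding='utf-8') as file:
        raw_text = file.read()

reloaded = json.loads(raw_text)
records_match = reloaded == demo.to_dict('records')
```

Because `ensure_ascii=False` is used, accented names such as "École" appear verbatim in the file rather than as `\u` escapes.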
-
-## Table condition_set_term
-
-
-```python
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
- ir_archiving
-
-
-
-
- 0
- 1.0
- 1
- 999999
- 0
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
-
-
- 1
- 2.0
- 2
- 999999
- 12
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -1187146317861229577
- 1080785657261440835
- 1
-
-
- 2
- 3.0
- 3
- 355
- 0
- True
- 1
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- -6827815856646016670
- -4410614044147247907
- 1
-
-
- 3
- 4.0
- 3
- 355
- 0
- True
- 2
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- 5388365857945903435
- -492868609330074007
- 1
-
-
- 4
- 5.0
- 1
- 999999
- 0
- False
- 999999
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
- -2781821769548802966
- 935766765288137110
- 0
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
-
-
- 48674
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
-
-
- 48675
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
-
-
- 48676
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
-
-
- 48677
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
-
-
-
-
48678 rows × 16 columns
-
-
-
-
-
-```python
-terms_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- rp_id
- id_content_hash
- id_content_hash_licence
- version
- cost_factor
- embargo_months
- ir_archiving
- licence
- comment
- id
- source
-
-
-
-
- 0
- 1.0
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 999999
- 0
- 1
- 999999
- Institutional archiving locations: Non-Commerc...
- 1
-
-
-
- 1
- 2.0
- NaN
- -1187146317861229577
- 1080785657261440835
- 2
- 999999
- 12
- 1
- 999999
- Institutional archiving locations: Non-Commerc...
- 2
-
-
-
- 2
- 3.0
- NaN
- -6827815856646016670
- -4410614044147247907
- 3
- 355
- 0
- 1
- 1
- Institutional archiving locations: Any Website...
- 3
-
-
-
- 3
- 4.0
- NaN
- 5388365857945903435
- -492868609330074007
- 3
- 355
- 0
- 1
- 2
- Institutional archiving locations: Any Website...
- 4
-
-
-
- 4
- 5.0
- NaN
- -2781821769548802966
- 935766765288137110
- 1
- 999999
- 0
- 0
- 999999
- Non institutional archiving locations: ChemRxi...
- 5
-
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1315
- NaN
- 1.0
- -6020029623494903364
- -5435886237991661497
- 3
- 581
- 0
- 1
- 1
- Elsevier Read & Publish agreement
- 1316
-
-
-
- 1316
- NaN
- 18129.0
- -1955262099488276438
- 6359482801433181261
- 3
- 581
- 0
- 1
- 1
- NaN
- 1317
-
-
-
- 1317
- NaN
- 24845.0
- -681455397323083870
- 5265079689140421989
- 3
- 581
- 0
- 1
- 1
- Wiley Read & Publish agreement
- 1318
-
-
-
- 1318
- NaN
- 38750.0
- 6747956201225830719
- -4648758608429098534
- 3
- 581
- 0
- 1
- 1
- Taylor and Francis Read & Publish agreement
- 1319
-
-
-
- 1319
- NaN
- 39164.0
- 7687377827846095855
- 2298488065455407402
- 3
- 581
- 60
- 1
- 1
- Cambridge University Press (CUP) Read & Publis...
- 1320
-
-
-
-
-
1320 rows × 12 columns
-
-
-
-
-
-```python
-# merge in the term ids (left join on the content hash)
-term_orig = pd.merge(term_orig, terms_export_dedup[['id_content_hash', 'id']], on='id_content_hash', how='left')
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
- ir_archiving
- id
-
-
-
-
- 0
- 1.0
- 1
- 999999
- 0
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 1
-
-
- 1
- 2.0
- 2
- 999999
- 12
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -1187146317861229577
- 1080785657261440835
- 1
- 2
-
-
- 2
- 3.0
- 3
- 355
- 0
- True
- 1
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- -6827815856646016670
- -4410614044147247907
- 1
- 3
-
-
- 3
- 4.0
- 3
- 355
- 0
- True
- 2
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- 5388365857945903435
- -492868609330074007
- 1
- 4
-
-
- 4
- 5.0
- 1
- 999999
- 0
- False
- 999999
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
- -2781821769548802966
- 935766765288137110
- 0
- 5
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48674
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48675
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48676
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48677
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
-
-
48678 rows × 17 columns
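The left merge above keeps the row count at 48678, which is what one expects when each `id_content_hash` occurs only once in the deduplicated table. A minimal sketch of how pandas can enforce that expectation is given below. The two frames and their hash values are made up for illustration; `validate` and `indicator` are standard `pd.merge` parameters.

```python
import pandas as pd

# Illustrative stand-ins; the hash values are made up
terms = pd.DataFrame({'id_content_hash': [11, 22, 22, 33], 'version': [1, 2, 2, 3]})
dedup = pd.DataFrame({'id_content_hash': [11, 22, 33], 'id': [1, 2, 3]})

merged = pd.merge(
    terms,
    dedup[['id_content_hash', 'id']],
    on='id_content_hash',
    how='left',
    validate='many_to_one',  # raises MergeError if a hash repeats in dedup
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Rows whose hash was not found in the deduplicated table
unmatched = merged[merged['_merge'] == 'left_only']
```

With `validate='many_to_one'`, a duplicated hash on the right side fails loudly instead of silently multiplying rows.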
-
-
-
-
-
-```python
-term_orig = term_orig.rename(columns = {'id' : 'term'})
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
- ir_archiving
- term
-
-
-
-
- 0
- 1.0
- 1
- 999999
- 0
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 1
-
-
- 1
- 2.0
- 2
- 999999
- 12
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -1187146317861229577
- 1080785657261440835
- 1
- 2
-
-
- 2
- 3.0
- 3
- 355
- 0
- True
- 1
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- -6827815856646016670
- -4410614044147247907
- 1
- 3
-
-
- 3
- 4.0
- 3
- 355
- 0
- True
- 2
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- 5388365857945903435
- -492868609330074007
- 1
- 4
-
-
- 4
- 5.0
- 1
- 999999
- 0
- False
- 999999
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
- -2781821769548802966
- 935766765288137110
- 0
- 5
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48674
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48675
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48676
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
- 48677
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
-
-
-
-
48678 rows × 17 columns
-
-
-
-
-
-```python
-condition_type
-```
-
-
-
-
-
-
-
-
-
-
-| | id | condition_issuer |
-|---|---|---|
-| 0 | 1 | Journal-only |
-| 1 | 2 | Organization-only |
-| 2 | 3 | Journal-organization agreement |
-
-
-
-
-
-
-
-
-```python
-# assign the condition type: 3 (journal-organization agreement) by default, 1 (journal-only) when no ROR is set
-term_orig['condition_type'] = 3
-term_orig.loc[term_orig['ror'].isna(), 'condition_type'] = 1
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
- ir_archiving
- term
- condition_type
-
-
-
-
- 0
- 1.0
- 1
- 999999
- 0
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 1
- 1
-
-
- 1
- 2.0
- 2
- 999999
- 12
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -1187146317861229577
- 1080785657261440835
- 1
- 2
- 1
-
-
- 2
- 3.0
- 3
- 355
- 0
- True
- 1
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- -6827815856646016670
- -4410614044147247907
- 1
- 3
- 1
-
-
- 3
- 4.0
- 3
- 355
- 0
- True
- 2
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- 5388365857945903435
- -492868609330074007
- 1
- 4
- 1
-
-
- 4
- 5.0
- 1
- 999999
- 0
- False
- 999999
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
- -2781821769548802966
- 935766765288137110
- 0
- 5
- 1
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
-
-
- 48674
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
-
-
- 48675
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
-
-
- 48676
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
-
-
- 48677
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
-
-
-
-
48678 rows × 18 columns
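
The two-step assignment above can also be expressed as one vectorised call. This is a minimal sketch on a toy frame that holds only the `ror` column; the helper name `assign_condition_type` is hypothetical, and the codes 1 and 3 are taken from the `condition_type` table shown earlier.

```python
import numpy as np
import pandas as pd


def assign_condition_type(terms: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with condition_type = 1 when 'ror' is missing, else 3."""
    out = terms.copy()
    # np.where picks 1 for rows without a ROR and 3 for all other rows
    out["condition_type"] = np.where(out["ror"].isna(), 1, 3)
    return out


# Toy data: one journal-only term and one term tied to an organization
toy = pd.DataFrame({"ror": [None, "https://ror.org/01swzsf04"]})
labelled = assign_condition_type(toy)
```

Working on a copy leaves the input frame untouched, which makes the step safe to re-run in a notebook.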
-
-
-
-
-
-```python
-organization_export
-```
-
-
-
-
-
-
-
-
-
-
- name
- website
- country
- starting_year
- is_funder
- ror
- fundref
- id
-
-
-
-
- 0
- École Polytechnique Fédérale de Lausanne (EPFL)
- http://www.epfl.ch/index.en.html
- 215
- 1853
- 0
- https://ror.org/02s376052
- 501100001703
- 1
-
-
- 1
- University of Geneva (UNIGE)
- https://www.unige.ch/
- 215
- 1559
- 0
- https://ror.org/01swzsf04
- 501100006389
- 2
-
-
- 2
- Agroscope
- https://www.agroscope.admin.ch/agroscope/en/ho...
- 215
- 1850
- 0
- https://ror.org/04d8ztx87
-
- 3
-
-
- 3
- Bern University of Applied Sciences (BFH)
- http://www.bfh.ch/en/home.html
- 215
- 1997
- 0
- https://ror.org/02bnkt322
- 501100006259
- 4
-
-
- 4
- ETH Zurich (ETH Zurich)
- https://www.ethz.ch/en.html
- 215
- 1855
- 0
- https://ror.org/05a28rw58
- 501100003006
- 5
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 99
- National Aeronautics and Space Administration ...
- http://science.nasa.gov/
- 236
- 0
- 1
- https://ror.org/027ka1x80
- 100000104
- 100
-
-
- 100
- National Science Foundation (NSF)
- http://www.nsf.gov/
- 236
- 0
- 1
- https://ror.org/021nxhr62
- 100000001
- 101
-
-
- 101
- Academy of Medical Science
- https://acmedsci.ac.uk/
- 234
- 0
- 1
- https://ror.org/00c489v88
- 501100000691
- 102
-
-
- 102
- Prostate Cancer UK
- http://prostatecanceruk.org/
- 234
- 0
- 1
- https://ror.org/04dkv6329
- 501100000771
- 103
-
-
- 103
- Schweizerischer Nationalfonds zur Förderung de...
- http://www.snf.ch/de/Seiten/default.aspx
- 215
- 0
- 1
- https://ror.org/00yjd3n13
- 501100001711
- 104
-
-
-
-
104 rows × 8 columns
-
-
-
-
-
-```python
-# merge the organization ids (matched on ROR)
-term_orig = pd.merge(term_orig, organization_export[['ror', 'id']], on='ror', how='left')
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
- ir_archiving
- term
- condition_type
- id
-
-
-
-
- 0
- 1.0
- 1
- 999999
- 0
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 1
- 1
- NaN
-
-
- 1
- 2.0
- 2
- 999999
- 12
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -1187146317861229577
- 1080785657261440835
- 1
- 2
- 1
- NaN
-
-
- 2
- 3.0
- 3
- 355
- 0
- True
- 1
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- -6827815856646016670
- -4410614044147247907
- 1
- 3
- 1
- NaN
-
-
- 3
- 4.0
- 3
- 355
- 0
- True
- 2
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- 5388365857945903435
- -492868609330074007
- 1
- 4
- 1
- NaN
-
-
- 4
- 5.0
- 1
- 999999
- 0
- False
- 999999
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
- -2781821769548802966
- 935766765288137110
- 0
- 5
- 1
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 2.0
-
-
- 48674
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 35.0
-
-
- 48675
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 38.0
-
-
- 48676
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 46.0
-
-
- 48677
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 43.0
-
-
-
-
48678 rows × 19 columns
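
A left merge like the one above silently leaves `NaN` in `id` when a ROR has no match in the organization table. The sketch below shows one way to surface such rows using the `validate` and `indicator` parameters of `pd.merge`. The toy frames are illustrative assumptions, and `https://ror.org/unknown` is a made-up placeholder, not a real identifier.

```python
import pandas as pd

# Toy stand-ins for term_orig and organization_export (assumed shapes)
terms = pd.DataFrame({
    "term": [1, 2, 3],
    "ror": [None, "https://ror.org/01swzsf04", "https://ror.org/unknown"],
})
organizations = pd.DataFrame({
    "ror": ["https://ror.org/01swzsf04", "https://ror.org/02s376052"],
    "id": [2, 1],
})

merged = pd.merge(
    terms,
    organizations[["ror", "id"]],
    on="ror",
    how="left",
    validate="many_to_one",  # raises MergeError if a ROR occurs twice on the right
    indicator=True,          # adds a '_merge' column: 'both' or 'left_only'
)

# Rows that carry a ROR but found no organization
unmatched = merged[(merged["_merge"] == "left_only") & merged["ror"].notna()]
```

Rows without a ROR are expected to stay unmatched (they are journal-only terms), so the check filters them out and reports only RORs that are missing from the organization table.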
-
-
-
-
-
-```python
-term_orig = term_orig.rename(columns = {'id' : 'organization'})
-term_orig
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- version
- cost_factor
- embargo_months
- archiving
- licence
- journal
- prerequisite_funders
- ror
- comment
- rp_id
- valid_from
- valid_until
- id_content_hash
- id_content_hash_licence
- ir_archiving
- term
- condition_type
- organization
-
-
-
-
- 0
- 1.0
- 1
- 999999
- 0
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 1
- 1
- NaN
-
-
- 1
- 2.0
- 2
- 999999
- 12
- True
- 999999
- 532.0
- NaN
- NaN
- Institutional archiving locations: Non-Commerc...
- NaN
- NaN
- NaN
- -1187146317861229577
- 1080785657261440835
- 1
- 2
- 1
- NaN
-
-
- 2
- 3.0
- 3
- 355
- 0
- True
- 1
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- -6827815856646016670
- -4410614044147247907
- 1
- 3
- 1
- NaN
-
-
- 3
- 4.0
- 3
- 355
- 0
- True
- 2
- 532.0
- NaN
- NaN
- Institutional archiving locations: Any Website...
- NaN
- NaN
- NaN
- 5388365857945903435
- -492868609330074007
- 1
- 4
- 1
- NaN
-
-
- 4
- 5.0
- 1
- 999999
- 0
- False
- 999999
- 498.0
- NaN
- NaN
- Non institutional archiving locations: ChemRxi...
- NaN
- NaN
- NaN
- -2781821769548802966
- 935766765288137110
- 0
- 5
- 1
- NaN
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 48673
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/01swzsf04
- Cambridge University Press (CUP) Read & Publis...
- 40079.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 2.0
-
-
- 48674
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/019whta54
- Cambridge University Press (CUP) Read & Publis...
- 40080.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 35.0
-
-
- 48675
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/00vasag41
- Cambridge University Press (CUP) Read & Publis...
- 40081.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 38.0
-
-
- 48676
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05r0ap620
- Cambridge University Press (CUP) Read & Publis...
- 40082.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 46.0
-
-
- 48677
- NaN
- 3
- 581
- 60
- True
- 5
- 592.0
- NaN
- https://ror.org/05pmsvm27
- Cambridge University Press (CUP) Read & Publis...
- 40083.0
- 2021-01-01
- 2023-12-31
- 7687377827846095855
- 2298495942200956358
- 1
- 1320
- 3
- 43.0
-
-
-
-
48678 rows × 19 columns
-
-
-
-
-
-```python
-# concatenate the values that share the same id_content_hash
-condition_set_term_dedup_terms = term_orig[['term', 'id_content_hash']]
-condition_set_term_dedup_terms_dedup = condition_set_term_dedup_terms.drop_duplicates()
-condition_set_term_dedup_terms_dedup = condition_set_term_dedup_terms_dedup.loc[condition_set_term_dedup_terms_dedup['term'].notna()]
-condition_set_term_dedup_terms_dedup['term'] = condition_set_term_dedup_terms_dedup['term'].astype(int)
-condition_set_term_dedup_terms_dedup['term'] = condition_set_term_dedup_terms_dedup['term'].astype(str)
-condition_set_term_dedup_terms_dedup = condition_set_term_dedup_terms_dedup.groupby('id_content_hash').agg({'term': lambda x: ', '.join(x)})
-condition_set_term_dedup_terms_dedup
-```
-
-
-
-
-
-
-
-
-
-
- term
-
-
- id_content_hash
-
-
-
-
-
- -9213354388875732238
- 271
-
-
- -9200070744422558377
- 1039
-
-
- -9171783117023104395
- 1175
-
-
- -9134952646468948163
- 1283
-
-
- -9133013648751406289
- 1106
-
-
- ...
- ...
-
-
- 9195001330432352893
- 1103
-
-
- 9200466168345981543
- 250
-
-
- 9213878808178729253
- 580
-
-
- 9218389208912777882
- 38
-
-
- 9219045216097074691
- 919
-
-
-
-
1320 rows × 1 columns
-
-
-
-
-
-```python
-# concatenate the values that share the same id_content_hash
-condition_set_term_dedup_journals = term_orig[['journal', 'id_content_hash']]
-condition_set_term_dedup_journals_dedup = condition_set_term_dedup_journals.drop_duplicates()
-condition_set_term_dedup_journals_dedup = condition_set_term_dedup_journals_dedup.loc[condition_set_term_dedup_journals_dedup['journal'].notna()]
-condition_set_term_dedup_journals_dedup['journal'] = condition_set_term_dedup_journals_dedup['journal'].astype(int)
-condition_set_term_dedup_journals_dedup['journal'] = condition_set_term_dedup_journals_dedup['journal'].astype(str)
-condition_set_term_dedup_journals_dedup = condition_set_term_dedup_journals_dedup.groupby('id_content_hash').agg({'journal': lambda x: ', '.join(x)})
-condition_set_term_dedup_journals_dedup
-```
-
-
-
-
-
-
-
-
-
-
- journal
-
-
- id_content_hash
-
-
-
-
-
- -9213354388875732238
- 342, 219, 18, 918, 309, 543, 642, 27, 246, 64,...
-
-
- -9200070744422558377
- 427
-
-
- -9171783117023104395
- 548, 240, 298, 132, 3, 516
-
-
- -9134952646468948163
- 990
-
-
- -9133013648751406289
- 366
-
-
- ...
- ...
-
-
- 9195001330432352893
- 687
-
-
- 9200466168345981543
- 230
-
-
- 9213878808178729253
- 722
-
-
- 9218389208912777882
- 199
-
-
- 9219045216097074691
- 190
-
-
-
-
1320 rows × 1 columns
-
-
-
-
-
-```python
-# concatenate the values that share the same id_content_hash
-condition_set_term_dedup_organizations = term_orig[['organization', 'id_content_hash']]
-condition_set_term_dedup_organizations_dedup = condition_set_term_dedup_organizations.drop_duplicates()
-condition_set_term_dedup_organizations_dedup = condition_set_term_dedup_organizations_dedup.loc[condition_set_term_dedup_organizations_dedup['organization'].notna()]
-condition_set_term_dedup_organizations_dedup['organization'] = condition_set_term_dedup_organizations_dedup['organization'].astype(int)
-condition_set_term_dedup_organizations_dedup['organization'] = condition_set_term_dedup_organizations_dedup['organization'].astype(str)
-condition_set_term_dedup_organizations_dedup = condition_set_term_dedup_organizations_dedup.groupby('id_content_hash').agg({'organization': lambda x: ', '.join(x)})
-condition_set_term_dedup_organizations_dedup
-```
-
-
-
-
-
-
-
-
-
-
- organization
-
-
- id_content_hash
-
-
-
-
-
- -9213354388875732238
- 75, 76, 77, 78
-
-
- -9200070744422558377
- 47
-
-
- -9134952646468948163
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
- -9133013648751406289
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
- -9085129519950455938
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
- ...
- ...
-
-
- 8745253383893524719
- 48, 64, 51, 74, 68, 67, 69, 59
-
-
- 8913401298465203811
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
- 8999447149908101495
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
- 9195001330432352893
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
- 9219045216097074691
- 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59...
-
-
-
-
277 rows × 1 columns
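
The three cells above (terms, journals, organizations) repeat the same steps: select a column with the hash, drop duplicates, drop missing values, cast to string, and join per hash. A minimal sketch of a shared helper follows; the name `join_ids_per_hash` and the toy data are assumptions, not part of the notebook.

```python
import pandas as pd


def join_ids_per_hash(frame: pd.DataFrame, column: str,
                      key: str = "id_content_hash") -> pd.DataFrame:
    """Collect the distinct values of `column` per `key` as one comma-separated string."""
    pairs = frame[[column, key]].drop_duplicates()
    # .copy() gives an independent frame, so the assignment below raises no warning
    pairs = pairs.loc[pairs[column].notna()].copy()
    # ids arrive as floats (e.g. 532.0) because the column contains NaN
    pairs[column] = pairs[column].astype(int).astype(str)
    return pairs.groupby(key).agg({column: lambda values: ", ".join(values)})


toy = pd.DataFrame({
    "id_content_hash": [10, 10, 10, 20],
    "journal": [532.0, 482.0, 532.0, None],
})
journals = join_ids_per_hash(toy, "journal")
```

With such a helper, each of the three cells would reduce to a single call, for example `join_ids_per_hash(term_orig, 'organization')`.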
-
-
-
-
-
-```python
-# concatenate the values that share the same id: not possible for condition_type
-condition_set_term_dedup_condition_types = term_orig[['condition_type', 'id_content_hash']]
-condition_set_term_dedup_condition_types_dedup = condition_set_term_dedup_condition_types.drop_duplicates()
-condition_set_term_dedup_condition_types_dedup = condition_set_term_dedup_condition_types_dedup.loc[condition_set_term_dedup_condition_types_dedup['condition_type'].notna()]
-# condition_set_term_dedup_condition_types_dedup['condition_type'] = condition_set_term_dedup_condition_types_dedup['condition_type'].astype(int)
-# condition_set_term_dedup_condition_types_dedup['condition_type'] = condition_set_term_dedup_condition_types_dedup['condition_type'].astype(str)
-# condition_set_term_dedup_condition_types_dedup = condition_set_term_dedup_condition_types_dedup.groupby('id_content_hash').agg({'condition_type': lambda x: ', '.join(x)})
-condition_set_term_dedup_condition_types_dedup
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- id_content_hash
-
-
-
-
- 0
- 1
- -5068777248818105392
-
-
- 1
- 1
- -1187146317861229577
-
-
- 2
- 1
- -6827815856646016670
-
-
- 3
- 1
- 5388365857945903435
-
-
- 4
- 1
- -2781821769548802966
-
-
- ...
- ...
- ...
-
-
- 33439
- 3
- -681455397323083870
-
-
- 47344
- 3
- 6747956201225830719
-
-
- 47362
- 1
- 6747956201225830719
-
-
- 47758
- 3
- 7687377827846095855
-
-
- 47776
- 1
- 7687377827846095855
-
-
-
-
1533 rows × 2 columns
-
-
-
-
-
-```python
-# retrieve the grouped ids
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_terms_dedup, on='id_content_hash', how='left')
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_journals_dedup, on='id_content_hash', how='left')
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_organizations_dedup, on='id_content_hash', how='left')
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_condition_types_dedup, on='id_content_hash', how='left')
-terms_export_dedup
-```
-
-
-
-
-
-
-
-
-
-
- id_sherpa
- rp_id
- id_content_hash
- id_content_hash_licence
- version
- cost_factor
- embargo_months
- ir_archiving
- licence
- comment
- id
- source
- term
- journal
- organization
- condition_type
-
-
-
-
- 0
- 1.0
- NaN
- -5068777248818105392
- -8194612545168817012
- 1
- 999999
- 0
- 1
- 999999
- Institutional archiving locations: Non-Commerc...
- 1
-
- 1
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- NaN
- 1
-
-
- 1
- 2.0
- NaN
- -1187146317861229577
- 1080785657261440835
- 2
- 999999
- 12
- 1
- 999999
- Institutional archiving locations: Non-Commerc...
- 2
-
- 2
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- NaN
- 1
-
-
- 2
- 3.0
- NaN
- -6827815856646016670
- -4410614044147247907
- 3
- 355
- 0
- 1
- 1
- Institutional archiving locations: Any Website...
- 3
-
- 3
- 532
- NaN
- 1
-
-
- 3
- 4.0
- NaN
- 5388365857945903435
- -492868609330074007
- 3
- 355
- 0
- 1
- 2
- Institutional archiving locations: Any Website...
- 4
-
- 4
- 532
- NaN
- 1
-
-
- 4
- 5.0
- NaN
- -2781821769548802966
- 935766765288137110
- 1
- 999999
- 0
- 0
- 999999
- Non institutional archiving locations: ChemRxi...
- 5
-
- 5
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
- NaN
- 1
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1528
- NaN
- 24845.0
- -681455397323083870
- 5265079689140421989
- 3
- 581
- 0
- 1
- 1
- Wiley Read & Publish agreement
- 1318
-
- 1318
- 942, 854, 933, 297, 130, 144, 549, 283, 512, 1...
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 3
-
-
- 1529
- NaN
- 38750.0
- 6747956201225830719
- -4648758608429098534
- 3
- 581
- 0
- 1
- 1
- Taylor and Francis Read & Publish agreement
- 1319
-
- 1319
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 3
-
-
- 1530
- NaN
- 38750.0
- 6747956201225830719
- -4648758608429098534
- 3
- 581
- 0
- 1
- 1
- Taylor and Francis Read & Publish agreement
- 1319
-
- 1319
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 1
-
-
- 1531
- NaN
- 39164.0
- 7687377827846095855
- 2298488065455407402
- 3
- 581
- 60
- 1
- 1
- Cambridge University Press (CUP) Read & Publis...
- 1320
-
- 1320
- 866, 171, 186, 839, 592
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 3
-
-
- 1532
- NaN
- 39164.0
- 7687377827846095855
- 2298488065455407402
- 3
- 581
- 60
- 1
- 1
- Cambridge University Press (CUP) Read & Publis...
- 1320
-
- 1320
- 866, 171, 186, 839, 592
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 1
-
-
-
-
1533 rows × 16 columns
-
-
-
-
-
-```python
-condition_sets_orig = terms_export_dedup[['term', 'condition_type', 'organization', 'journal']]
-condition_sets_orig
-```
-
-
-
-
-
-
-
-
-
-
- term
- condition_type
- organization
- journal
-
-
-
-
- 0
- 1
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
-
-
- 1
- 2
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
-
-
- 2
- 3
- 1
- NaN
- 532
-
-
- 3
- 4
- 1
- NaN
- 532
-
-
- 4
- 5
- 1
- NaN
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
-
-
- ...
- ...
- ...
- ...
- ...
-
-
- 1528
- 1318
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 942, 854, 933, 297, 130, 144, 549, 283, 512, 1...
-
-
- 1529
- 1319
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
-
-
- 1530
- 1319
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
-
-
- 1531
- 1320
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
-
-
- 1532
- 1320
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
-
-
-
-
1533 rows × 4 columns
-
-
-
-
-
-```python
-# add a unique hash for each variant
-condition_sets_orig['id_term_hash'] = condition_sets_orig.apply(lambda x: hash(tuple(x[['condition_type', 'organization', 'journal']])), axis = 1)
-condition_sets_orig
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
-
-
-
-
-
-
-
-
-
-
-
-
- term
- condition_type
- organization
- journal
- id_term_hash
-
-
-
-
- 0
- 1
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
-
-
- 1
- 2
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
-
-
- 2
- 3
- 1
- NaN
- 532
- -3428409893954144223
-
-
- 3
- 4
- 1
- NaN
- 532
- -3428409893954144223
-
-
- 4
- 5
- 1
- NaN
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
- 5362274893926121442
-
-
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1528
- 1318
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 942, 854, 933, 297, 130, 144, 549, 283, 512, 1...
- -32115995447722756
-
-
- 1529
- 1319
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 4789694892756018439
-
-
- 1530
- 1319
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 7722626036678389533
-
-
- 1531
- 1320
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 6902392350219571553
-
-
- 1532
- 1320
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 4611302665250055299
-
-
-
-
1533 rows × 5 columns
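
One caveat about the cell above: Python's built-in `hash()` is salted per interpreter process for strings unless `PYTHONHASHSEED` is fixed, so the `id_term_hash` values can differ from one kernel session to the next. If the hash ever needs to be compared across runs, a digest from `hashlib` is reproducible. The sketch below is an alternative under that assumption, not the notebook's method; `stable_row_hash` and the toy frame are hypothetical.

```python
import hashlib

import pandas as pd


def stable_row_hash(row: pd.Series) -> str:
    """Return a SHA-256 hex digest of the row values; missing values count as ''."""
    parts = ["" if pd.isna(value) else str(value) for value in row]
    # The unit separator keeps ("1", "23") distinct from ("12", "3")
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


toy = pd.DataFrame({
    "condition_type": [1, 1, 3],
    "organization": [None, None, "3, 4"],
    "journal": ["532", "532", "866, 171"],
})
key_columns = ["condition_type", "organization", "journal"]
toy["id_term_hash"] = toy[key_columns].apply(stable_row_hash, axis=1)
```

Rows with identical values in the three key columns receive the same digest, which is the property the grouping step relies on.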
-
-
-
-
-
-```python
-# group the terms that have the same values for the remaining fields
-condition_sets_orig_terms = condition_sets_orig[['term', 'id_term_hash']]
-condition_sets_orig_terms_dedup = condition_sets_orig_terms.drop_duplicates()
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['term'].notna()]
-condition_sets_orig_terms_dedup['term'] = condition_sets_orig_terms_dedup['term'].astype(int)
-condition_sets_orig_terms_dedup['term'] = condition_sets_orig_terms_dedup['term'].astype(str)
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.groupby('id_term_hash').agg({'term': lambda x: ', '.join(x)})
-condition_sets_orig_terms_dedup
-```
-
-
-
-
-
-
-
-
-
-
- term
-
-
- id_term_hash
-
-
-
-
-
- -9221122160312283608
- 796
-
-
- -9194263828544732083
- 812
-
-
- -9192944961126408089
- 1246
-
-
- -9191653994283170820
- 965
-
-
- -9180782299480364441
- 1185
-
-
- ...
- ...
-
-
- 9197647807999611822
- 421
-
-
- 9200686802301911565
- 359
-
-
- 9203218741230767213
- 1056
-
-
- 9211734360905731286
- 630, 631
-
-
- 9214772761176685077
- 706
-
-
-
-
1149 rows × 1 columns
-
-
-
-
-
-```python
-# add the grouped ids
-condition_sets_orig_terms = pd.merge(condition_sets_orig, condition_sets_orig_terms_dedup, on='id_term_hash', how='left')
-condition_sets_orig_terms
-```
-
-
-
-
-
-
-
-
-
-
- term_x
- condition_type
- organization
- journal
- id_term_hash
- term_y
-
-
-
-
- 0
- 1
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
- 1
- 2
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
- 2
- 3
- 1
- NaN
- 532
- -3428409893954144223
- 3, 4
-
-
- 3
- 4
- 1
- NaN
- 532
- -3428409893954144223
- 3, 4
-
-
- 4
- 5
- 1
- NaN
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
- 5362274893926121442
- 5, 6
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1528
- 1318
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 942, 854, 933, 297, 130, 144, 549, 283, 512, 1...
- -32115995447722756
- 1318
-
-
- 1529
- 1319
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 4789694892756018439
- 1319
-
-
- 1530
- 1319
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 7722626036678389533
- 1319
-
-
- 1531
- 1320
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 6902392350219571553
- 1320
-
-
- 1532
- 1320
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 4611302665250055299
- 1320
-
-
-
-
1533 rows × 6 columns
-
-
-
-
-
-```python
-# rename terms
-del condition_sets_orig_terms['term_x']
-condition_sets_orig_terms = condition_sets_orig_terms.rename(columns = {'term_y' : 'term'})
-condition_sets_orig_terms
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
-
-
-
-
- 0
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
- 1
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
- 2
- 1
- NaN
- 532
- -3428409893954144223
- 3, 4
-
-
- 3
- 1
- NaN
- 532
- -3428409893954144223
- 3, 4
-
-
- 4
- 1
- NaN
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
- 5362274893926121442
- 5, 6
-
-
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1528
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 942, 854, 933, 297, 130, 144, 549, 283, 512, 1...
- -32115995447722756
- 1318
-
-
- 1529
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 4789694892756018439
- 1319
-
-
- 1530
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 7722626036678389533
- 1319
-
-
- 1531
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 6902392350219571553
- 1320
-
-
- 1532
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 4611302665250055299
- 1320
-
-
-
-
1533 rows × 5 columns
-
-
-
-
-
-```python
-# test duplicates
-condition_sets_orig_terms.loc[condition_sets_orig_terms.duplicated()].sort_values(by='term')
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
-
-
-
-
- 1
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
- 1187
- 1
- NaN
- 779
- -9104022108665859378
- 1001, 1002, 1003
-
-
- 1188
- 1
- NaN
- 779
- -9104022108665859378
- 1001, 1002, 1003
-
-
- 1190
- 1
- NaN
- 7, 22
- -5795971402582868051
- 1004, 1005
-
-
- 1194
- 1
- NaN
- 825
- -2985725204066841336
- 1008, 1009
-
-
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1161
- 1
- NaN
- 855
- 158530994336307876
- 978, 979
-
-
- 1168
- 1
- NaN
- 654
- -5164377982436891368
- 984, 985
-
-
- 1179
- 1
- NaN
- 751
- -1857992192228010123
- 993, 994, 995
-
-
- 1180
- 1
- NaN
- 751
- -1857992192228010123
- 993, 994, 995
-
-
- 1182
- 1
- NaN
- 531
- -3353627437951234546
- 996, 997
-
-
-
-
384 rows × 5 columns
-
-
-
-
-
-```python
-condition_sets_orig_terms.loc[condition_sets_orig_terms.duplicated()].shape[0]
-```
-
-
-
-
- 384
-
-
-
-
-```python
-condition_sets_orig_terms_dedup = condition_sets_orig_terms.drop_duplicates()
-condition_sets_orig_terms_dedup
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
-
-
-
-
- 0
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
- 2
- 1
- NaN
- 532
- -3428409893954144223
- 3, 4
-
-
- 4
- 1
- NaN
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
- 5362274893926121442
- 5, 6
-
-
- 6
- 1
- NaN
- 498
- -713947468848485257
- 7, 8
-
-
- 8
- 1
- NaN
- 789
- -5332045039572836456
- 9, 10, 11, 12
-
-
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1528
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 942, 854, 933, 297, 130, 144, 549, 283, 512, 1...
- -32115995447722756
- 1318
-
-
- 1529
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 4789694892756018439
- 1319
-
-
- 1530
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 714, 633, 48, 704, 408, 535, 754, 581, 979
- 7722626036678389533
- 1319
-
-
- 1531
- 3
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 6902392350219571553
- 1320
-
-
- 1532
- 1
- 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1...
- 866, 171, 186, 839, 592
- 4611302665250055299
- 1320
-
-
-
-
1149 rows × 5 columns
-
-
-
-
-
-```python
-# add the missing fields
-condition_sets_orig_terms_dedup['comment'] = ''
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
-
-
-
-
-```python
-# replace "nan" values: list the rows where 'journal' is NaN
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['journal'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
- comment
-
-
-
-
-
-
-
-
-
-
-```python
-# replace "nan" values: list the rows where 'term' is NaN
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['term'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
- comment
-
-
-
-
-
-
-
-
-
-
-```python
-# replace "nan" values: list the rows where 'condition_type' is NaN
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['condition_type'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
- comment
-
-
-
-
-
-
-
-
-
-
-```python
-# replace "nan" values: list the rows where 'organization' is NaN
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['organization'].isna()]
-```
-
-
-
-
-
-
-
-
-
-
- condition_type
- organization
- journal
- id_term_hash
- term
- comment
-
-
-
-
- 0
- 1
- NaN
- 532, 482, 452, 663, 323, 674, 317, 154, 439, 5...
- -5197283134070040275
- 1, 2
-
-
-
- 2
- 1
- NaN
- 532
- -3428409893954144223
- 3, 4
-
-
-
- 4
- 1
- NaN
- 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,...
- 5362274893926121442
- 5, 6
-
-
-
- 6
- 1
- NaN
- 498
- -713947468848485257
- 7, 8
-
-
-
- 8
- 1
- NaN
- 789
- -5332045039572836456
- 9, 10, 11, 12
-
-
-
- ...
- ...
- ...
- ...
- ...
- ...
- ...
-
-
- 1515
- 1
- NaN
- 870
- 3031852869228425137
- 1306, 1307
-
-
-
- 1517
- 1
- NaN
- 41
- -7902056154606509806
- 1308, 1309
-
-
-
- 1519
- 1
- NaN
- 80
- 7657867214417959485
- 1310, 1311
-
-
-
- 1521
- 1
- NaN
- 533
- 7303862352984295282
- 1312, 1313
-
-
-
- 1523
- 1
- NaN
- 608
- 6548018561563906677
- 1314, 1315
-
-
-
-
-
661 rows × 6 columns
-
-
-
-
-
-```python
-# replace "nan" values in 'organization' with an empty string
-condition_sets_orig_terms_dedup['organization'] = condition_sets_orig_terms_dedup['organization'].fillna('')
-condition_sets_orig_terms_dedup
-```
-
- C:\Users\iriarte\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
- A value is trying to be set on a copy of a slice from a DataFrame.
- Try using .loc[row_indexer,col_indexer] = value instead
-
- See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
-
-|  | condition_type | organization | journal | id_term_hash | term | comment |
-|---|---|---|---|---|---|---|
-| 0 | 1 |  | 532, 482, 452, 663, 323, 674, 317, 154, 439, 5... | -5197283134070040275 | 1, 2 |  |
-| 2 | 1 |  | 532 | -3428409893954144223 | 3, 4 |  |
-| 4 | 1 |  | 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,... | 5362274893926121442 | 5, 6 |  |
-| 6 | 1 |  | 498 | -713947468848485257 | 7, 8 |  |
-| 8 | 1 |  | 789 | -5332045039572836456 | 9, 10, 11, 12 |  |
-| ... | ... | ... | ... | ... | ... | ... |
-| 1528 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... | -32115995447722756 | 1318 |  |
-| 1529 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 4789694892756018439 | 1319 |  |
-| 1530 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 7722626036678389533 | 1319 |  |
-| 1531 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 6902392350219571553 | 1320 |  |
-| 1532 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 4611302665250055299 | 1320 |  |
-
-1149 rows × 6 columns
-
-```python
-# turn the index into an id
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.reset_index()
-# set the id to index + 1 (the old index has gaps, so these ids are not contiguous yet)
-condition_sets_orig_terms_dedup['id'] = condition_sets_orig_terms_dedup['index'] + 1
-del condition_sets_orig_terms_dedup['index']
-condition_sets_orig_terms_dedup
-```
-
-|  | condition_type | organization | journal | id_term_hash | term | comment | id |
-|---|---|---|---|---|---|---|---|
-| 0 | 1 |  | 532, 482, 452, 663, 323, 674, 317, 154, 439, 5... | -5197283134070040275 | 1, 2 |  | 1 |
-| 1 | 1 |  | 532 | -3428409893954144223 | 3, 4 |  | 3 |
-| 2 | 1 |  | 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,... | 5362274893926121442 | 5, 6 |  | 5 |
-| 3 | 1 |  | 498 | -713947468848485257 | 7, 8 |  | 7 |
-| 4 | 1 |  | 789 | -5332045039572836456 | 9, 10, 11, 12 |  | 9 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1144 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... | -32115995447722756 | 1318 |  | 1529 |
-| 1145 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 4789694892756018439 | 1319 |  | 1530 |
-| 1146 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 7722626036678389533 | 1319 |  | 1531 |
-| 1147 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 6902392350219571553 | 1320 |  | 1532 |
-| 1148 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 4611302665250055299 | 1320 |  | 1533 |
-
-1149 rows × 7 columns
-
-```python
-# turn the index into an id (second pass: the index is now contiguous)
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.reset_index()
-# set the id to index + 1, which now yields the ids 1..n without gaps
-condition_sets_orig_terms_dedup['id'] = condition_sets_orig_terms_dedup['index'] + 1
-del condition_sets_orig_terms_dedup['index']
-condition_sets_orig_terms_dedup
-```
-
-|  | condition_type | organization | journal | id_term_hash | term | comment | id |
-|---|---|---|---|---|---|---|---|
-| 0 | 1 |  | 532, 482, 452, 663, 323, 674, 317, 154, 439, 5... | -5197283134070040275 | 1, 2 |  | 1 |
-| 1 | 1 |  | 532 | -3428409893954144223 | 3, 4 |  | 2 |
-| 2 | 1 |  | 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,... | 5362274893926121442 | 5, 6 |  | 3 |
-| 3 | 1 |  | 498 | -713947468848485257 | 7, 8 |  | 4 |
-| 4 | 1 |  | 789 | -5332045039572836456 | 9, 10, 11, 12 |  | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1144 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... | -32115995447722756 | 1318 |  | 1145 |
-| 1145 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 4789694892756018439 | 1319 |  | 1146 |
-| 1146 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 7722626036678389533 | 1319 |  | 1147 |
-| 1147 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 6902392350219571553 | 1320 |  | 1148 |
-| 1148 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 4611302665250055299 | 1320 |  | 1149 |
-
-1149 rows × 7 columns
-
-```python
-# export the table as JSON
-result = condition_sets_orig_terms_dedup[['id', 'condition_type', 'organization', 'journal', 'term', 'comment']].to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/condition_set.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export excel
-condition_sets_orig_terms_dedup[['id', 'condition_type', 'organization', 'journal', 'term', 'comment']].to_excel('sample/condition_set.xlsx', index=False)
-```
-
-
-```python
-# TSV export (tab-separated, to match the .tsv extension)
-condition_sets_orig_terms_dedup[['id', 'condition_type', 'organization', 'journal', 'term', 'comment']].to_csv('sample/condition_set.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-## Table organization_condition_set
-
-
-```python
-condition_sets_orig_terms_dedup
-```
-
-|  | condition_type | organization | journal | id_term_hash | term | comment | id |
-|---|---|---|---|---|---|---|---|
-| 0 | 1 |  | 532, 482, 452, 663, 323, 674, 317, 154, 439, 5... | -5197283134070040275 | 1, 2 |  | 1 |
-| 1 | 1 |  | 532 | -3428409893954144223 | 3, 4 |  | 2 |
-| 2 | 1 |  | 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,... | 5362274893926121442 | 5, 6 |  | 3 |
-| 3 | 1 |  | 498 | -713947468848485257 | 7, 8 |  | 4 |
-| 4 | 1 |  | 789 | -5332045039572836456 | 9, 10, 11, 12 |  | 5 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1144 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... | -32115995447722756 | 1318 |  | 1145 |
-| 1145 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 4789694892756018439 | 1319 |  | 1146 |
-| 1146 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 7722626036678389533 | 1319 |  | 1147 |
-| 1147 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 6902392350219571553 | 1320 |  | 1148 |
-| 1148 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 4611302665250055299 | 1320 |  | 1149 |
-
-1149 rows × 7 columns
-
-```python
-condition_sets_orig_terms_dedup.loc[(condition_sets_orig_terms_dedup['organization'].notna()) & (condition_sets_orig_terms_dedup['organization'] != '')]
-```
-
-|  | condition_type | organization | journal | id_term_hash | term | comment | id |
-|---|---|---|---|---|---|---|---|
-| 5 | 3 | 47 | 789 | -6118989085408562349 | 13 |  | 6 |
-| 11 | 3 | 47 | 668, 576, 371, 410, 849, 184, 670, 559, 58, 16... | 7026376488862543796 | 22 |  | 12 |
-| 12 | 1 | 47 | 668, 576, 371, 410, 849, 184, 670, 559, 58, 16... | 8899497448130036698 | 22 |  | 13 |
-| 21 | 1 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 985, 485, 787, 415, 189, 395, 652, 83, 227, 44... | 3530505283797139276 | 42 |  | 22 |
-| 22 | 3 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 985, 485, 787, 415, 189, 395, 652, 83, 227, 44... | 3056402465711846666 | 42 |  | 23 |
-| ... | ... | ... | ... | ... | ... | ... | ... |
-| 1144 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... | -32115995447722756 | 1318 |  | 1145 |
-| 1145 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 4789694892756018439 | 1319 |  | 1146 |
-| 1146 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 714, 633, 48, 704, 408, 535, 754, 581, 979 | 7722626036678389533 | 1319 |  | 1147 |
-| 1147 | 3 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 6902392350219571553 | 1320 |  | 1148 |
-| 1148 | 1 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 866, 171, 186, 839, 592 | 4611302665250055299 | 1320 |  | 1149 |
-
-488 rows × 7 columns
-
-```python
-# build the DataFrame
-# col_names = ['id',
-# 'organization',
-# 'condition_set',
-# 'valid_from',
-# 'valid_until'
-# ]
-# organization_condition = pd.DataFrame(columns = col_names)
-organization_condition = condition_sets_orig_terms_dedup.loc[(condition_sets_orig_terms_dedup['organization'].notna()) & (condition_sets_orig_terms_dedup['organization'] != '')][['id', 'organization', 'term']]
-organization_condition
-```
-
-|  | id | organization | term |
-|---|---|---|---|
-| 5 | 6 | 47 | 13 |
-| 11 | 12 | 47 | 22 |
-| 12 | 13 | 47 | 22 |
-| 21 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |
-| 22 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |
-| ... | ... | ... | ... |
-| 1144 | 1145 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1318 |
-| 1145 | 1146 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 |
-| 1146 | 1147 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 |
-| 1147 | 1148 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 |
-| 1148 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 |
-
-488 rows × 3 columns
-
-```python
-# extract the term ids (one row per term)
-organization_condition_split = organization_condition.assign(term = organization_condition.term.str.split(',')).explode('term')
-organization_condition_split
-```
-
-|  | id | organization | term |
-|---|---|---|---|
-| 5 | 6 | 47 | 13 |
-| 11 | 12 | 47 | 22 |
-| 12 | 13 | 47 | 22 |
-| 21 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |
-| 22 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |
-| ... | ... | ... | ... |
-| 1144 | 1145 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1318 |
-| 1145 | 1146 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 |
-| 1146 | 1147 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 |
-| 1147 | 1148 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 |
-| 1148 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 |
-
-490 rows × 3 columns
-
-```python
-organization_condition_split.loc[organization_condition_split['organization'].isna()]
-```
-
-|  | id | organization | term |
-|---|---|---|---|
-
-```python
-organization_condition_split.loc[organization_condition_split['term'].isna()]
-```
-
-|  | id | organization | term |
-|---|---|---|---|
-
-```python
-organization_condition_split['term'] = organization_condition_split['term'].astype(int)
-organization_condition_split
-```
-
-|  | id | organization | term |
-|---|---|---|---|
-| 5 | 6 | 47 | 13 |
-| 11 | 12 | 47 | 22 |
-| 12 | 13 | 47 | 22 |
-| 21 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |
-| 22 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |
-| ... | ... | ... | ... |
-| 1144 | 1145 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1318 |
-| 1145 | 1146 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 |
-| 1146 | 1147 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 |
-| 1147 | 1148 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 |
-| 1148 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 |
-
-490 rows × 3 columns
-
-```python
-# terms with their ROR and validity dates
-terms_export_dates
-```
-
-|  | id_content_hash | ror | valid_from | valid_until | term |
-|---|---|---|---|---|---|
-| 0 | -6020029623494903364 | https://ror.org/04d8ztx87 | 2020-01-01 | 2023-12-31 | 1316 |
-| 1 | -6020029623494903364 | https://ror.org/02bnkt322 | 2020-01-01 | 2023-12-31 | 1316 |
-| 2 | -6020029623494903364 | https://ror.org/00zg4za48 | 2020-01-01 | 2023-12-31 | 1316 |
-| 3 | -6020029623494903364 | https://ror.org/02s376052 | 2020-01-01 | 2023-12-31 | 1316 |
-| 4 | -6020029623494903364 | https://ror.org/05a28rw58 | 2020-01-01 | 2023-12-31 | 1316 |
-| ... | ... | ... | ... | ... | ... |
-| 40078 | 7687377827846095855 | https://ror.org/01swzsf04 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40079 | 7687377827846095855 | https://ror.org/019whta54 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40080 | 7687377827846095855 | https://ror.org/00vasag41 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40081 | 7687377827846095855 | https://ror.org/05r0ap620 | 2021-01-01 | 2023-12-31 | 1320 |
-| 40082 | 7687377827846095855 | https://ror.org/05pmsvm27 | 2021-01-01 | 2023-12-31 | 1320 |
-
-40083 rows × 5 columns
-
-```python
-# merge to get the validity dates
-organization_condition_split = pd.merge(organization_condition_split, terms_export_dates[['term', 'valid_from', 'valid_until']], on='term', how='left')
-organization_condition_split
-```
-
-|  | id | organization | term | valid_from | valid_until |
-|---|---|---|---|---|---|
-| 0 | 6 | 47 | 13 | NaN | NaN |
-| 1 | 12 | 47 | 22 | NaN | NaN |
-| 2 | 13 | 47 | 22 | NaN | NaN |
-| 3 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 | NaN | NaN |
-| 4 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 | NaN | NaN |
-| ... | ... | ... | ... | ... | ... |
-| 48610 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 48611 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 48612 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 48613 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 48614 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-
-48615 rows × 5 columns
-
-```python
-# deduplicate
-organization_condition_split_dedup = organization_condition_split.drop_duplicates()
-organization_condition_split_dedup
-```
-
-|  | id | organization | term | valid_from | valid_until |
-|---|---|---|---|---|---|
-| 0 | 6 | 47 | 13 | NaN | NaN |
-| 1 | 12 | 47 | 22 | NaN | NaN |
-| 2 | 13 | 47 | 22 | NaN | NaN |
-| 3 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 | NaN | NaN |
-| 4 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 | NaN | NaN |
-| ... | ... | ... | ... | ... | ... |
-| 32042 | 1145 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1318 | 2021-01-01 | 2024-12-31 |
-| 45947 | 1146 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 | 2021-01-01 | 2023-12-31 |
-| 46361 | 1147 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 | 2021-01-01 | 2023-12-31 |
-| 46775 | 1148 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 47695 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-
-490 rows × 5 columns
-
-```python
-organization_condition = pd.merge(organization_condition, organization_condition_split_dedup[['id', 'valid_from', 'valid_until']], on='id', how='left')
-organization_condition
-```
-
-|  | id | organization | term | valid_from | valid_until |
-|---|---|---|---|---|---|
-| 0 | 6 | 47 | 13 | NaN | NaN |
-| 1 | 12 | 47 | 22 | NaN | NaN |
-| 2 | 13 | 47 | 22 | NaN | NaN |
-| 3 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 | NaN | NaN |
-| 4 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 | NaN | NaN |
-| ... | ... | ... | ... | ... | ... |
-| 485 | 1145 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1318 | 2021-01-01 | 2024-12-31 |
-| 486 | 1146 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 | 2021-01-01 | 2023-12-31 |
-| 487 | 1147 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 | 2021-01-01 | 2023-12-31 |
-| 488 | 1148 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 489 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-
-490 rows × 5 columns
-
-```python
-organization_condition = organization_condition.rename(columns = {'id' : 'condition_set'})
-organization_condition['valid_from'] = organization_condition['valid_from'].fillna('')
-organization_condition['valid_until'] = organization_condition['valid_until'].fillna('')
-organization_condition
-```
-
-|  | condition_set | organization | term | valid_from | valid_until |
-|---|---|---|---|---|---|
-| 0 | 6 | 47 | 13 |  |  |
-| 1 | 12 | 47 | 22 |  |  |
-| 2 | 13 | 47 | 22 |  |  |
-| 3 | 22 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |  |  |
-| 4 | 23 | 48, 64, 51, 74, 68, 67, 69, 59, 75, 76, 77, 78 | 42 |  |  |
-| ... | ... | ... | ... | ... | ... |
-| 485 | 1145 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1318 | 2021-01-01 | 2024-12-31 |
-| 486 | 1146 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 | 2021-01-01 | 2023-12-31 |
-| 487 | 1147 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1319 | 2021-01-01 | 2023-12-31 |
-| 488 | 1148 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-| 489 | 1149 | 3, 4, 6, 24, 1, 5, 31, 27, 7, 8, 28, 11, 44, 1... | 1320 | 2021-01-01 | 2023-12-31 |
-
-490 rows × 5 columns
-
-```python
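-# final split to get one row per organization
-organization_condition_fin = organization_condition.assign(organization = organization_condition.organization.str.split(',')).explode('organization')
-# strip the blank that the split leaves after each comma
-organization_condition_fin['organization'] = organization_condition_fin['organization'].str.strip()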
-organization_condition_fin
-```
-
-|  | condition_set | organization | term | valid_from | valid_until |
-|---|---|---|---|---|---|
-| 0 | 6 | 47 | 13 |  |  |
-| 1 | 12 | 47 | 22 |  |  |
-| 2 | 13 | 47 | 22 |  |  |
-| 3 | 22 | 48 | 42 |  |  |
-| 3 | 22 | 64 | 42 |  |  |
-| ... | ... | ... | ... | ... | ... |
-| 489 | 1149 | 2 | 1320 | 2021-01-01 | 2023-12-31 |
-| 489 | 1149 | 35 | 1320 | 2021-01-01 | 2023-12-31 |
-| 489 | 1149 | 38 | 1320 | 2021-01-01 | 2023-12-31 |
-| 489 | 1149 | 46 | 1320 | 2021-01-01 | 2023-12-31 |
-| 489 | 1149 | 43 | 1320 | 2021-01-01 | 2023-12-31 |
-
-6834 rows × 5 columns
-
-```python
-# add the id as index + 1
-organization_condition_fin = organization_condition_fin.reset_index()
-organization_condition_fin['id'] = organization_condition_fin.index + 1
-del organization_condition_fin['index']
-organization_condition_fin
-```
-
-|  | condition_set | organization | term | valid_from | valid_until | id |
-|---|---|---|---|---|---|---|
-| 0 | 6 | 47 | 13 |  |  | 1 |
-| 1 | 12 | 47 | 22 |  |  | 2 |
-| 2 | 13 | 47 | 22 |  |  | 3 |
-| 3 | 22 | 48 | 42 |  |  | 4 |
-| 4 | 22 | 64 | 42 |  |  | 5 |
-| ... | ... | ... | ... | ... | ... | ... |
-| 6829 | 1149 | 2 | 1320 | 2021-01-01 | 2023-12-31 | 6830 |
-| 6830 | 1149 | 35 | 1320 | 2021-01-01 | 2023-12-31 | 6831 |
-| 6831 | 1149 | 38 | 1320 | 2021-01-01 | 2023-12-31 | 6832 |
-| 6832 | 1149 | 46 | 1320 | 2021-01-01 | 2023-12-31 | 6833 |
-| 6833 | 1149 | 43 | 1320 | 2021-01-01 | 2023-12-31 | 6834 |
-
-6834 rows × 6 columns
-
-```python
-# export the table as JSON
-result = organization_condition_fin[['id', 'condition_set', 'organization', 'valid_from', 'valid_until']].to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/organization_condition.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export excel
-organization_condition_fin[['id', 'condition_set', 'organization', 'valid_from', 'valid_until']].to_excel('sample/organization_condition.xlsx', index=False)
-```
-
-
-```python
-# TSV export (tab-separated, to match the .tsv extension)
-organization_condition_fin[['id', 'condition_set', 'organization', 'valid_from', 'valid_until']].to_csv('sample/organization_condition.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-## Table journal_condition_set
-
-
-```python
-# build the DataFrame
-# col_names = ['id',
-# 'journal',
-# 'condition_set',
-# 'valid_from',
-# 'valid_until'
-# ]
-# journal_condition = pd.DataFrame(columns = col_names)
-journal_condition = condition_sets_orig_terms_dedup.loc[(condition_sets_orig_terms_dedup['journal'].notna()) & (condition_sets_orig_terms_dedup['journal'] != '')][['id', 'journal']]
-journal_condition
-```
-
-|  | id | journal |
-|---|---|---|
-| 0 | 1 | 532, 482, 452, 663, 323, 674, 317, 154, 439, 5... |
-| 1 | 2 | 532 |
-| 2 | 3 | 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,... |
-| 3 | 4 | 498 |
-| 4 | 5 | 789 |
-| ... | ... | ... |
-| 1144 | 1145 | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... |
-| 1145 | 1146 | 714, 633, 48, 704, 408, 535, 754, 581, 979 |
-| 1146 | 1147 | 714, 633, 48, 704, 408, 535, 754, 581, 979 |
-| 1147 | 1148 | 866, 171, 186, 839, 592 |
-| 1148 | 1149 | 866, 171, 186, 839, 592 |
-
-1149 rows × 2 columns
-
-```python
-journal_condition = journal_condition.rename(columns = {'id' : 'condition_set'})
-journal_condition['valid_from'] = ''
-journal_condition['valid_until'] = ''
-journal_condition
-```
-
-|  | condition_set | journal | valid_from | valid_until |
-|---|---|---|---|---|
-| 0 | 1 | 532, 482, 452, 663, 323, 674, 317, 154, 439, 5... |  |  |
-| 1 | 2 | 532 |  |  |
-| 2 | 3 | 498, 70, 359, 573, 63, 66, 274, 116, 384, 163,... |  |  |
-| 3 | 4 | 498 |  |  |
-| 4 | 5 | 789 |  |  |
-| ... | ... | ... | ... | ... |
-| 1144 | 1145 | 942, 854, 933, 297, 130, 144, 549, 283, 512, 1... |  |  |
-| 1145 | 1146 | 714, 633, 48, 704, 408, 535, 754, 581, 979 |  |  |
-| 1146 | 1147 | 714, 633, 48, 704, 408, 535, 754, 581, 979 |  |  |
-| 1147 | 1148 | 866, 171, 186, 839, 592 |  |  |
-| 1148 | 1149 | 866, 171, 186, 839, 592 |  |  |
-
-1149 rows × 4 columns
-
-```python
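-# final split to get one row per journal
-journal_condition_fin = journal_condition.assign(journal = journal_condition.journal.str.split(',')).explode('journal')
-# strip the blank that the split leaves after each comma
-journal_condition_fin['journal'] = journal_condition_fin['journal'].str.strip()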
-journal_condition_fin
-```
-
-|  | condition_set | journal | valid_from | valid_until |
-|---|---|---|---|---|
-| 0 | 1 | 532 |  |  |
-| 0 | 1 | 482 |  |  |
-| 0 | 1 | 452 |  |  |
-| 0 | 1 | 663 |  |  |
-| 0 | 1 | 323 |  |  |
-| ... | ... | ... | ... | ... |
-| 1148 | 1149 | 866 |  |  |
-| 1148 | 1149 | 171 |  |  |
-| 1148 | 1149 | 186 |  |  |
-| 1148 | 1149 | 839 |  |  |
-| 1148 | 1149 | 592 |  |  |
-
-3033 rows × 4 columns
-
-```python
-# add the id as index + 1
-journal_condition_fin = journal_condition_fin.reset_index()
-journal_condition_fin['id'] = journal_condition_fin.index + 1
-del journal_condition_fin['index']
-journal_condition_fin
-```
-
-|  | condition_set | journal | valid_from | valid_until | id |
-|---|---|---|---|---|---|
-| 0 | 1 | 532 |  |  | 1 |
-| 1 | 1 | 482 |  |  | 2 |
-| 2 | 1 | 452 |  |  | 3 |
-| 3 | 1 | 663 |  |  | 4 |
-| 4 | 1 | 323 |  |  | 5 |
-| ... | ... | ... | ... | ... | ... |
-| 3028 | 1149 | 866 |  |  | 3029 |
-| 3029 | 1149 | 171 |  |  | 3030 |
-| 3030 | 1149 | 186 |  |  | 3031 |
-| 3031 | 1149 | 839 |  |  | 3032 |
-| 3032 | 1149 | 592 |  |  | 3033 |
-
-3033 rows × 5 columns
-
-```python
-# export the table as JSON
-result = journal_condition_fin.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/journal_condition.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-```
-
-
-```python
-# export excel
-journal_condition_fin.to_excel('sample/journal_condition.xlsx', index=False)
-```
-
-
-```python
-# TSV export (tab-separated, to match the .tsv extension)
-journal_condition_fin.to_csv('sample/journal_condition.tsv', sep='\t', encoding='utf-8', index=False)
-```
-
-
diff --git a/import_scripts/10_oacct_terms.py b/import_scripts/10_oacct_terms.py
deleted file mode 100644
index bb7de41f..00000000
--- a/import_scripts/10_oacct_terms.py
+++ /dev/null
@@ -1,1975 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) project
-#
-# P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook transforms the data extracted from the various sources and exports them into the tables of the OACCT application.
-#
-# Author: **Pablo Iriarte**, Université de Genève (pablo.iriarte@unige.ch)
-# Last updated: 08.09.2021
-
-# In[1]:
-
-
-import pandas as pd
-import csv
-import json
-import numpy as np
-import os
-# display all columns
-pd.set_option('display.max_columns', None)
-# define the first id
-id_start = 1
-
-
-# ## Import the file extracted from Sherpa
-
-# In[2]:
-
-
-sherpa = pd.read_csv('sample/sherpa_policies_brut.tsv', encoding='utf-8', header=0, sep='\t')
-sherpa
-
-
-# In[3]:
-
-
-# check the values for the versions
-sherpa['article_version'].value_counts()
-
-
-# In[4]:
-
-
-# check for rows without an ISSN
-sherpa.loc[sherpa['issn'].isna()]
-
-
-# In[5]:
-
-
-# add the ISSN-L
-issns = pd.read_csv('issn/20171102.ISSN-to-ISSN-L.txt', encoding='utf-8', header=0, sep='\t')
-issns
-
-
-# In[6]:
-
-
-# rename the columns
-issns = issns.rename(columns={'ISSN' : 'issn', 'ISSN-L' : 'issnl'})
-issns
-
-
-# In[7]:
-
-
-# merge with the sherpa table
-sherpa = pd.merge(sherpa, issns, on='issn', how='left')
-sherpa
-
-
-# In[8]:
-
-
-# check for rows without an ISSN-L
-sherpa.loc[sherpa['issnl'].isna()]
-
-
-# In[9]:
-
-
-# extract the IR archiving + embargo data by ISSN
-sherpa_ir = sherpa[['issnl', ]]
-
-
-# ## Import the Read & Publish licences file
-
-# In[10]:
-
-
-rp = pd.read_csv('sample/read_publish_brut_merge.tsv', encoding='utf-8', header=0, sep='\t')
-rp
-
-
-# In[11]:
-
-
-rp['embargo_months'].value_counts()
-
-
-# In[12]:
-
-
-# store the publisher in a single field
-# rp.loc[rp['Elsevier'] == 'x', 'public_notes'] = 'Elsevier Read & Publish agreement'
-rp.loc[rp['Elsevier'] == 'x', 'rp_publisher'] = 'Elsevier'
-rp.loc[rp['Springer Nature'] == 'x', 'rp_publisher'] = 'Springer Nature'
-rp.loc[rp['Wiley'] == 'x', 'rp_publisher'] = 'Wiley'
-rp.loc[rp['TF'] == 'x', 'rp_publisher'] = 'TF'
-rp.loc[rp['CUP'] == 'x', 'rp_publisher'] = 'CUP'
-rp
-
-
-# In[13]:
-
-
-# check the values for the publishers
-rp['rp_publisher'].value_counts()
-
-
-# In[14]:
-
-
-# check the values for the licences
-rp['license'].value_counts()
-
-
-# In[15]:
-
-
-# drop the unneeded fields (the columns are renamed in the next cell)
-del rp['Elsevier']
-del rp['Springer Nature']
-del rp['Wiley']
-del rp['TF']
-del rp['CUP']
-del rp['URL']
-rp
-
-
-# In[16]:
-
-
-# rename the columns
-rp = rp.rename(columns = {'Title' : 'title', 'ROR' : 'ror', 'read_publish_id' : 'rp_id'})
-rp
-
-
-# ## Table applicable_version
-
-# In[17]:
-
-
-# build the DataFrame with its 3 values: submitted, accepted, published
-# (DataFrame.append was removed in pandas 2.0, so the rows go to the constructor)
-col_names = ['id',
-             'type',
-             'description'
-            ]
-rows = [{'id': 1, 'type': 'submitted', 'description': 'Submitted version'},
-        {'id': 2, 'type': 'accepted', 'description': 'Accepted version'},
-        {'id': 3, 'type': 'published', 'description': 'Published version'}]
-applicable_version = pd.DataFrame(rows, columns=col_names)
-applicable_version
-
-
-# In[18]:
-
-
-# add the UNKNOWN value (pd.concat replaces DataFrame.append, removed in pandas 2.0)
-applicable_version = pd.concat([applicable_version, pd.DataFrame([{'id' : 999999, 'type' : 'UNKNOWN', 'description' : 'UNKNOWN'}])], ignore_index=True)
-applicable_version
-
-
-# In[19]:
-
-
-# select the final fields
-applicable_version_export = applicable_version[['id', 'description']]
-
-
-# In[20]:
-
-
-# export the applicable_version table as JSON
-result = applicable_version_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/version.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[21]:
-
-
-# export csv
-applicable_version_export.to_csv('sample/version.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[22]:
-
-
-# export excel
-applicable_version_export.to_excel('sample/version.xlsx', index=False)
-
-
-# In[23]:
-
-
-# merge with the sherpa table
-sherpa = pd.merge(sherpa, applicable_version[['id', 'type']], left_on='article_version', right_on='type', how='left')
-sherpa
-
-
-# In[24]:
-
-
-sherpa = sherpa.rename(columns = {'id_x' : 'id', 'id_y' : 'version'})
-del sherpa['type']
-sherpa
-
-
-# In[25]:
-
-
-# merge with the read & publish table
-rp = pd.merge(rp, applicable_version[['id', 'type']], left_on='article_version', right_on='type', how='left')
-rp
-
-
-# In[26]:
-
-
-rp = rp.rename(columns = {'id' : 'version'})
-del rp['type']
-rp
-
-
-# ## Table oa_licence
-
-# In[27]:
-
-
-# build the DataFrame
-# 'version' is not used; deduplicate by name without the version
-col_names = ['id',
- 'name',
- 'url'
- ]
-oa_licence = pd.DataFrame(columns = col_names)
-oa_licence
-
-
-# In[28]:
-
-
-# count the licence values
-sherpa['license'].value_counts()
-
-
-# In[29]:
-
-
-sherpa_licences = sherpa['license'].drop_duplicates()
-sherpa_licences = sherpa_licences.dropna()
-sherpa_licences
-
-
-# In[30]:
-
-
-oa_licence['sherpa_code'] = np.nan
-oa_licence
-
-
-# In[31]:
-
-
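-for code in sherpa_licences:
-    print(code)
-# DataFrame.append was removed in pandas 2.0: add all codes at once with pd.concat
-oa_licence = pd.concat([oa_licence, pd.DataFrame({'sherpa_code' : list(sherpa_licences)})], ignore_index=True)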
-
-
-# In[32]:
-
-
-oa_licence
-
-
-# In[33]:
-
-
-# turn the index into an id
-oa_licence = oa_licence.reset_index()
-# set the id to index + 1
-oa_licence['id'] = oa_licence['index'] + 1
-del oa_licence['index']
-oa_licence
-
-
-# In[34]:
-
-
-# add the names and URLs
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by', 'name'] = 'CC BY'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by', 'url'] = 'https://creativecommons.org/licenses/by/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_sa', 'name'] = 'CC BY-SA'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_sa', 'url'] = 'https://creativecommons.org/licenses/by-sa/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc', 'name'] = 'CC BY-NC'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc', 'url'] = 'https://creativecommons.org/licenses/by-nc/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_sa', 'name'] = 'CC BY-NC-SA'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_sa', 'url'] = 'https://creativecommons.org/licenses/by-nc-sa/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nd', 'name'] = 'CC BY-ND'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nd', 'url'] = 'https://creativecommons.org/licenses/by-nd/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_nd', 'name'] = 'CC BY-NC-ND'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_by_nc_nd', 'url'] = 'https://creativecommons.org/licenses/by-nc-nd/4.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc0', 'name'] = 'CC0'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc0', 'url'] = 'https://creativecommons.org/publicdomain/zero/1.0/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'bespoke_license', 'name'] = 'Specific license'
-oa_licence.loc[oa_licence['sherpa_code'] == 'bespoke_license', 'url'] = ''
-oa_licence.loc[oa_licence['sherpa_code'] == 'all_rights_reserved', 'name'] = 'All rights reserved'
-oa_licence.loc[oa_licence['sherpa_code'] == 'all_rights_reserved', 'url'] = ''
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_gnu_gpl', 'name'] = 'GNU GPL'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_gnu_gpl', 'url'] = 'http://gnugpl.org/'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_public_domain', 'name'] = 'Public domain'
-oa_licence.loc[oa_licence['sherpa_code'] == 'cc_public_domain', 'url'] = 'https://creativecommons.org/share-your-work/public-domain/'
-# oa_licence.loc[oa_licence['sherpa_code'] == 'bespoke_license', 'url'] = 'https://port.sas.ac.uk/mod/book/view.php?id=1340&chapterid=1003'
-oa_licence
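The licence names and URLs above are assigned one `.loc` statement at a time. The same mapping could be kept in one dictionary and joined in a single step. The following is only a sketch: `LICENCES`, `lookup` and `oa_licence_demo` are hypothetical names, and only three of the codes are reproduced.

```python
import pandas as pd

# Hypothetical subset of the SHERPA codes handled above.
LICENCES = {
    'cc_by_sa': ('CC BY-SA', 'https://creativecommons.org/licenses/by-sa/4.0/'),
    'cc0': ('CC0', 'https://creativecommons.org/publicdomain/zero/1.0/'),
    'all_rights_reserved': ('All rights reserved', ''),
}

# Demo table standing in for oa_licence; 'not_listed' has no mapping on purpose.
oa_licence_demo = pd.DataFrame({'sherpa_code': ['cc0', 'cc_by_sa', 'not_listed']})

# Turn the dictionary into a lookup frame and left-join it on the code.
lookup = pd.DataFrame.from_dict(LICENCES, orient='index', columns=['name', 'url'])
lookup = lookup.rename_axis('sherpa_code').reset_index()
oa_licence_demo = oa_licence_demo.merge(lookup, on='sherpa_code', how='left')
```

Codes without an entry keep missing values in `name` and `url`, which makes unmapped licences easy to spot.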
-
-
-# In[35]:
-
-
-# add the UNKNOWN value
-oa_licence = oa_licence.append({'id' : 999999, 'sherpa_code' : '___', 'name' : 'UNKNOWN', 'url' : ''}, ignore_index=True)
-oa_licence
-
-
-# In[36]:
-
-
-# rename the licence column in the sherpa and rp tables
-sherpa = sherpa.rename(columns = {'license' : 'sherpa_code'})
-sherpa
-
-
-# In[37]:
-
-
-# rename the licence column in the sherpa and rp tables
-rp = rp.rename(columns = {'license' : 'sherpa_code'})
-rp
-
-
-# In[38]:
-
-
-# merge
-sherpa = pd.merge(sherpa, oa_licence[['sherpa_code', 'id']], on='sherpa_code', how='left')
-sherpa
-
-
-# In[39]:
-
-
-sherpa = sherpa.rename(columns = {'id_x' : 'id', 'id_y' : 'licence'})
-sherpa
-
-
-# In[40]:
-
-
-# merge
-rp = pd.merge(rp, oa_licence[['sherpa_code', 'id']], on='sherpa_code', how='left')
-rp
-
-
-# In[41]:
-
-
-rp = rp.rename(columns = {'id' : 'licence'})
-rp
-
-
-# In[42]:
-
-
-# rename the final fields
-oa_licence_export = oa_licence[['id', 'name', 'url']]
-oa_licence_export = oa_licence_export.rename(columns={'name' : 'name_or_abbrev', 'url' : 'website'})
-
-
-# In[43]:
-
-
-# export the oa_licence table
-result = oa_licence_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/licence.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[44]:
-
-
-# export csv
-oa_licence_export.to_csv('sample/licence.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[45]:
-
-
-# export excel
-oa_licence_export.to_excel('sample/licence.xlsx', index=False)
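The JSON and TSV export steps above are repeated for every table in this script. They could be wrapped in one helper, sketched here under assumptions: `export_table`, `out_dir` and the demo names are hypothetical, and the Excel export is left out to avoid depending on an Excel writer.

```python
import json
import os
import tempfile

import pandas as pd


def export_table(df, name, out_dir):
    """Write df as <name>.json (pretty-printed records) and <name>.tsv."""
    # Round-trip through to_json so pandas handles NaN and numpy types.
    records = json.loads(df.to_json(orient='records', force_ascii=False))
    json_path = os.path.join(out_dir, name + '.json')
    with open(json_path, 'w', encoding='utf-8') as file:
        json.dump(records, file, indent=2, ensure_ascii=False)
    tsv_path = os.path.join(out_dir, name + '.tsv')
    df.to_csv(tsv_path, sep='\t', encoding='utf-8', index=False)
    return json_path, tsv_path


# Usage with a throwaway directory and a one-row demo table.
demo_dir = tempfile.mkdtemp()
demo_paths = export_table(pd.DataFrame({'id': [1], 'name': ['APC']}),
                          'cost_factor_type', demo_dir)
```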
-
-
-# ## Table cost_factor_type
-
-# In[46]:
-
-
-# create the DataFrame
-col_names = ['id',
- 'name'
- ]
-cost_factor_type = pd.DataFrame(columns = col_names)
-cost_factor_type = cost_factor_type.append({'id' : 1, 'name' : 'APC'}, ignore_index=True)
-cost_factor_type = cost_factor_type.append({'id' : 2, 'name' : 'Discount'}, ignore_index=True)
-cost_factor_type = cost_factor_type.append({'id' : 3, 'name' : 'Refund'}, ignore_index=True)
-cost_factor_type
-
-
-# In[47]:
-
-
-# add the UNKNOWN value
-cost_factor_type = cost_factor_type.append({'id' : 999999, 'name' : 'UNKNOWN'}, ignore_index=True)
-cost_factor_type
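`DataFrame.append`, used throughout this script, was deprecated in pandas 1.4 and removed in pandas 2.0. On a recent pandas the same rows can be added with `pd.concat`. A minimal sketch, with `lookup_demo` and `unknown_row` as hypothetical names:

```python
import pandas as pd

# Build the lookup table in one call instead of appending row by row.
lookup_demo = pd.DataFrame([
    {'id': 1, 'name': 'APC'},
    {'id': 2, 'name': 'Discount'},
])

# Add the UNKNOWN sentinel row; ignore_index renumbers the result 0..n-1.
unknown_row = pd.DataFrame([{'id': 999999, 'name': 'UNKNOWN'}])
lookup_demo = pd.concat([lookup_demo, unknown_row], ignore_index=True)
```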
-
-
-# In[48]:
-
-
-# export the table
-result = cost_factor_type.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/cost_factor_type.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[49]:
-
-
-# export csv
-cost_factor_type.to_csv('sample/cost_factor_type.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[50]:
-
-
-# export excel
-cost_factor_type.to_excel('sample/cost_factor_type.xlsx', index=False)
-
-
-# ## Table cost_factor
-
-# ### Adding APC data from DOAJ
-
-# In[51]:
-
-
-# load the DOAJ journal data
-doaj = pd.read_csv('doaj/journalcsv__doaj_20210312_0636_utf8.csv', encoding='utf-8', header=0)
-doaj
-
-
-# In[52]:
-
-
-# keep the rows that have an APC
-doaj_apc = doaj.loc[doaj['APC'] == 'Yes'][['Journal ISSN (print version)', 'Journal EISSN (online version)', 'APC amount']]
-doaj_apc
-
-
-# In[53]:
-
-
-# keep the rows without an APC
-doaj_apc_no = doaj.loc[doaj['APC'] == 'No'][['Journal ISSN (print version)', 'Journal EISSN (online version)']]
-doaj_apc_no
-
-
-# In[54]:
-
-
-# assign the value 0
-doaj_apc_no['APC amount'] = 0
-doaj_apc_no
-
-
-# In[55]:
-
-
-# append to the APC table
-doaj_apc = doaj_apc.append(doaj_apc_no, ignore_index=True)
-doaj_apc
-
-
-# In[56]:
-
-
-# split the price into 'amount' and 'symbol'
-doaj_apc[['amount', 'symbol']] = doaj_apc['APC amount'].str.split(' ', n=1, expand=True)
-doaj_apc
-
-
-# In[57]:
-
-
-doaj_apc.loc[doaj_apc['APC amount'] == 0, 'amount'] = 0
-doaj_apc.loc[doaj_apc['APC amount'] == 0, 'symbol'] = ''
-doaj_apc
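The split above relies on the `APC amount` column holding strings of the form `<number> <currency>`. That format is an assumption about the DOAJ export, not something this script checks. The sketch below shows the same split on made-up values and also converts the amount to a number. `apc_demo` and `parts` are hypothetical names.

```python
import pandas as pd

# Made-up values in the assumed '<number> <currency>' format, plus a missing one.
apc_demo = pd.DataFrame({'APC amount': ['1500 USD', '900 EUR', None]})

# Split on the first space only, so the currency part stays in one piece.
parts = apc_demo['APC amount'].str.split(' ', n=1, expand=True)

# Unparseable or missing amounts become NaN instead of raising.
apc_demo['amount'] = pd.to_numeric(parts[0], errors='coerce')
apc_demo['symbol'] = parts[1]
```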
-
-
-# In[58]:
-
-
-# add the missing fields
-doaj_apc['cost_factor_type'] = 1
-doaj_apc['comment'] = 'Source: DOAJ'
-doaj_apc
-
-
-# In[59]:
-
-
-# rename the fields
-doaj_apc = doaj_apc.rename(columns = {'Journal ISSN (print version)' : 'issn_print', 'Journal EISSN (online version)' : 'issn_electronic'})
-doaj_apc
-
-
-# In[60]:
-
-
-# add the issn, taken from the electronic ISSN
-doaj_apc['issn'] = doaj_apc['issn_electronic']
-doaj_apc
-
-
-# In[61]:
-
-
-doaj_apc.loc[doaj_apc['issn'].isna()]
-
-
-# In[62]:
-
-
-# fall back to the print ISSN when the issn is empty
-doaj_apc.loc[doaj_apc['issn'].isna(), 'issn'] = doaj_apc['issn_print']
-doaj_apc.loc[doaj_apc['issn'].isna()]
-
-
-# In[63]:
-
-
-doaj_apc = pd.merge(doaj_apc, issns, on='issn', how='left')
-doaj_apc
-
-
-# In[64]:
-
-
-# rename the columns
-doaj_apc = doaj_apc.rename(columns={'issnl' : 'issn_link'})
-doaj_apc
-
-
-# ### Adding APCs from the Journal Database (Zurich Open Repository and Archive)
-#
-# https://www.jdb.uzh.ch/
-
-# In[65]:
-
-
-# JDB, the Zurich journal database
-jdb = pd.read_csv('zora/jdb_apcs.tsv', encoding='utf-8', header=0, sep='\t')
-jdb
-
-
-# In[66]:
-
-
-# rename the id
-jdb = jdb.rename(columns = {'id' : 'jdb_id'})
-jdb
-
-
-# In[67]:
-
-
-# add the missing fields
-jdb['cost_factor_type'] = 1
-jdb['comment'] = 'Source: JDB (' + jdb['apc_date'].astype(str) + ')'
-jdb
-
-
-# In[68]:
-
-
-# rename the fields
-jdb = jdb.rename(columns = {'apc_fee' : 'amount', 'apc_currency' : 'symbol'})
-jdb
-
-
-# In[69]:
-
-
-jdb = jdb.drop_duplicates(subset='jdb_id', keep='last')
-
-
-# In[70]:
-
-
-# import OpenAPC with the maximum values
-openapc = pd.read_csv('openapc/open_apc_max.tsv', encoding='utf-8', header=0, sep='\t')
-openapc
-
-
-# In[71]:
-
-
-# rename the fields
-openapc = openapc.rename(columns = {'period' : 'apc_date', 'issn_l' : 'issn_link', 'euro' : 'amount'})
-openapc
-
-
-# In[72]:
-
-
-# add the cost factor type and the currency symbol
-openapc['cost_factor_type'] = 1
-openapc['jdb_id'] = np.nan
-openapc['symbol'] = 'EUR'
-openapc['comment'] = 'Source: OpenAPC (' + openapc['apc_date'].astype(str) + ')'
-openapc
-
-
-# In[73]:
-
-
-# append the OpenAPC rows
-jdb = jdb.append(openapc, ignore_index=True)
-jdb
-
-
-# In[74]:
-
-
-# drop duplicates by ISSN-L and date
-jdb = jdb.drop_duplicates(subset=['issn_link', 'apc_date'], keep='first')
-jdb
-
-
-# In[75]:
-
-
-# combine with the DOAJ rows
-cost_factor = doaj_apc.append(jdb, ignore_index=True)
-cost_factor
-
-
-# In[76]:
-
-
-# test issnl
-cost_factor.loc[cost_factor['issn_link'].isna()]
-
-
-# In[77]:
-
-
-# merge with the ISSN-L table
-cost_factor = pd.merge(cost_factor, issns, on='issn', how='left')
-cost_factor
-
-
-# In[78]:
-
-
-# test issnl
-cost_factor.loc[cost_factor['issnl'].isna()]
-
-
-# In[79]:
-
-
-# fill in the issn where it is missing
-cost_factor.loc[cost_factor['issn'].isna(), 'issn'] = cost_factor['issn_print']
-cost_factor.loc[cost_factor['issn'].isna(), 'issn'] = cost_factor['issn_electronic']
-cost_factor.loc[cost_factor['issn'].isna(), 'issn'] = cost_factor['issn_link']
-cost_factor.loc[cost_factor['issn'].isna()]
-
-
-# In[80]:
-
-
-# fill in the issnl where it is missing
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn_link']
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn_print']
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn_electronic']
-cost_factor.loc[cost_factor['issnl'].isna(), 'issnl'] = cost_factor['issn']
-cost_factor.loc[cost_factor['issnl'].isna()]
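The fallback chain above fills `issnl` from one column after another. The same "first available value wins" logic can be written as chained `fillna` calls. This is a sketch on made-up ISSNs, with `issn_demo` as a hypothetical name and only two of the fallback columns shown.

```python
import pandas as pd

issn_demo = pd.DataFrame({
    'issnl': [None, None, '1111-1111'],
    'issn_link': [None, '2222-2222', None],
    'issn_print': ['3333-3333', None, None],
})

# Each fillna only touches rows that are still missing, so the order of the
# calls defines the priority: issnl, then issn_link, then issn_print.
issn_demo['issnl'] = (
    issn_demo['issnl']
    .fillna(issn_demo['issn_link'])
    .fillna(issn_demo['issn_print'])
)
```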
-
-
-# In[81]:
-
-
-# select the identifier columns for the merge
-cost_factor_ids = cost_factor[['issn', 'issnl', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids_1 = cost_factor_ids_1.rename(columns = {'issn_link' : 'issn'})
-# cost_factor_ids_2 = cost_factor.loc[cost_factor['issn_electronic'].notna()][['issn_electronic', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids_2 = cost_factor_ids_2.rename(columns = {'issn_electronic' : 'issn'})
-# cost_factor_ids_3 = cost_factor.loc[cost_factor['issn_print'].notna()][['issn_print', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids_3 = cost_factor_ids_3.rename(columns = {'issn_print' : 'issn'})
-# cost_factor_ids_4 = cost_factor.loc[cost_factor['issn'].notna()][['issn', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-# cost_factor_ids = cost_factor_ids_1.append(cost_factor_ids_2)
-# cost_factor_ids = cost_factor_ids.append(cost_factor_ids_3)
-# cost_factor_ids = cost_factor_ids.append(cost_factor_ids_4)
-cost_factor_ids
-
-
-# In[82]:
-
-
-# drop duplicates by ISSN-L
-cost_factor_ids = cost_factor_ids.drop_duplicates(subset=['issnl'])
-cost_factor_ids
-
-
-# In[83]:
-
-
-# merge in the other direction to keep only the rows present in the file
-cost_factor_ids = pd.merge(cost_factor_ids, sherpa[['id', 'issnl']], on='issnl', how='left')
-cost_factor_ids
-
-
-# In[84]:
-
-
-# keep the rows that matched in the merge
-cost_factor_ids_all = cost_factor_ids.loc[cost_factor_ids['id'].notnull()]
-cost_factor_ids_all
-
-
-# In[85]:
-
-
-# drop duplicates
-cost_factor_ids_all = cost_factor_ids_all.drop_duplicates(subset=['id'])
-cost_factor_ids_all
-
-
-# In[86]:
-
-
-# drop duplicates by ISSN-L
-cost_factor_ids_all = cost_factor_ids_all.drop_duplicates(subset=['issnl'])
-del cost_factor_ids_all['id']
-cost_factor_ids_all
-
-
-# In[87]:
-
-
-# renumber the rows 0..n-1, then build the id from the index, offset by id_start
-cost_factor_ids_all = cost_factor_ids_all.reset_index(drop=True)
-cost_factor_ids_all['cost_factor'] = cost_factor_ids_all.index + id_start
-cost_factor_ids_all
-
-
-# In[88]:
-
-
-# merge with the sherpa table
-sherpa = pd.merge(sherpa, cost_factor_ids_all[['issnl', 'cost_factor']], on='issnl', how='left')
-sherpa
-
-
-# In[89]:
-
-
-sherpa.loc[sherpa['cost_factor'].isna()]
-
-
-# In[90]:
-
-
-# keep the APCs only for the published version
-sherpa.loc[sherpa['article_version'] != 'published', 'cost_factor'] = np.nan
-sherpa.loc[sherpa['cost_factor'].notna()]
-
-
-# In[91]:
-
-
-# rename the id of the raw sherpa file
-# cost_factor_ids_all = cost_factor_ids_all.rename(columns = {'id' : 'id_sherpa'})
-cost_factor_ids_all = cost_factor_ids_all.rename(columns = {'cost_factor' : 'id'})
-cost_factor_ids_all
-
-
-# In[92]:
-
-
-cost_factor_ids_all['id'] = cost_factor_ids_all['id'].astype(int)
-
-
-# In[93]:
-
-
-cost_factor_ids_all
-
-
-# In[94]:
-
-
-cost_factor_export = cost_factor_ids_all[['id', 'cost_factor_type', 'amount', 'symbol', 'comment']]
-cost_factor_export
-
-
-# In[95]:
-
-
-cost_factor_export.shape[0]
-
-
-# In[96]:
-
-
-# add a 100% discount entry for the Read & Publish agreements
-rpid = cost_factor_export.shape[0] + 1
-cost_factor_export = cost_factor_export.append({'id' : rpid, 'cost_factor_type' : 2, 'amount' : 100, 'symbol' : '%', 'comment' : 'Read & Publish agreement'}, ignore_index=True)
-cost_factor_export
-
-
-# In[97]:
-
-
-# add the id to the Read & Publish table
-rp['cost_factor'] = rpid
-rp
-
-
-# In[98]:
-
-
-# add the UNKNOWN value
-cost_factor_export = cost_factor_export.append({'id' : 999999, 'cost_factor_type' : 999999, 'amount' : 0, 'symbol' : '', 'comment' : 'UNKNOWN'}, ignore_index=True)
-cost_factor_export
-
-
-# In[99]:
-
-
-# export the table
-result = cost_factor_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/cost_factor.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[100]:
-
-
-# export csv
-cost_factor_export.to_csv('sample/cost_factor.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[101]:
-
-
-# export excel
-cost_factor_export.to_excel('sample/cost_factor.xlsx', index=False)
-
-
-# ## Table term
-
-# In[102]:
-
-
-sherpa
-
-
-# In[103]:
-
-
-# col_names = ['id', 'applicable_version', 'cost_factor', 'embargo', 'archiving']
-term_sherpa = sherpa[['id', 'version', 'cost_factor', 'embargo', 'archiving', 'locations_ir', 'locations_not_ir', 'licence', 'journal', 'conditions', 'public_notes', 'prerequisite_funders', 'prerequisite_funders_ror']]
-term_sherpa
-
-
-# In[104]:
-
-
-# rename the fields
-term_sherpa = term_sherpa.rename(columns = {'id' : 'id_sherpa', 'embargo' : 'embargo_months', 'prerequisite_funders_ror' : 'ror'})
-term_sherpa
-
-
-# In[105]:
-
-
-# merge the fields into the comment: conditions, public_notes, locations_not_ir
-term_sherpa['conditions'] = term_sherpa['conditions'].fillna('')
-term_sherpa['public_notes'] = term_sherpa['public_notes'].fillna('')
-term_sherpa['locations_not_ir'] = term_sherpa['locations_not_ir'].fillna('')
-term_sherpa['locations_ir'] = term_sherpa['locations_ir'].fillna('')
-term_sherpa.loc[term_sherpa['locations_not_ir'] != '', 'locations_not_ir'] = 'Non institutional archiving locations: ' + term_sherpa['locations_not_ir']
-term_sherpa.loc[term_sherpa['locations_ir'] != '', 'locations_ir'] = 'Institutional archiving locations: ' + term_sherpa['locations_ir']
-# start from an empty comment so that rows without an archiving flag do not end up as NaN
-term_sherpa['comment'] = ''
-term_sherpa.loc[term_sherpa['archiving'] == False, 'comment'] = term_sherpa['locations_not_ir']
-term_sherpa.loc[term_sherpa['archiving'] == True, 'comment'] = term_sherpa['locations_ir']
-# append the conditions exactly once; the masks are computed before any assignment
-has_comment = term_sherpa['comment'] != ''
-has_conditions = term_sherpa['conditions'] != ''
-term_sherpa.loc[has_comment & has_conditions, 'comment'] = term_sherpa['comment'] + ' ; Conditions: ' + term_sherpa['conditions']
-term_sherpa.loc[~has_comment & has_conditions, 'comment'] = 'Conditions: ' + term_sherpa['conditions']
-# append the public notes the same way
-has_comment = term_sherpa['comment'] != ''
-has_notes = term_sherpa['public_notes'] != ''
-term_sherpa.loc[has_comment & has_notes, 'comment'] = term_sherpa['comment'] + ' ; Public notes: ' + term_sherpa['public_notes']
-term_sherpa.loc[~has_comment & has_notes, 'comment'] = 'Public notes: ' + term_sherpa['public_notes']
-term_sherpa
-
-
-# In[106]:
-
-
-term_sherpa['prerequisite_funders'].value_counts()
-
-
-# In[107]:
-
-
-rp
-
-
-# In[108]:
-
-
-term_rp = rp[['rp_id', 'version', 'archiving', 'embargo_months', 'cost_factor', 'licence', 'journal', 'rp_publisher', 'ror', 'valid_from', 'valid_until']]
-term_rp
-
-
-# In[109]:
-
-
-term_rp['rp_publisher'].value_counts()
-
-
-# In[110]:
-
-
-term_rp.loc[term_rp['rp_publisher'] == 'Elsevier', 'comment'] = 'Elsevier Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'Wiley', 'comment'] = 'Wiley Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'TF', 'comment'] = 'Taylor and Francis Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'Springer Nature ', 'comment'] = 'Springer Nature Read & Publish agreement'
-term_rp.loc[term_rp['rp_publisher'] == 'CUP', 'comment'] = 'Cambridge University Press (CUP) Read & Publish agreement. Article types covered: Research Articles, Review Articles, Rapid Communication, Brief Reports and Case Reports'
-del term_rp['rp_publisher']
-term_rp
-
-
-# In[111]:
-
-
-# concatenate the two tables
-term_orig = term_sherpa[['id_sherpa', 'version', 'cost_factor', 'embargo_months', 'archiving', 'licence', 'journal', 'prerequisite_funders', 'ror', 'comment']]
-term_orig
-
-
-# In[112]:
-
-
-term_orig = term_orig.append(term_rp, ignore_index=True, sort=False)
-term_orig
-
-
-# In[113]:
-
-
-# add a unique hash for each variant
-term_orig['id_content_hash'] = term_orig.apply(lambda x: hash(tuple(x[['version', 'cost_factor', 'embargo_months', 'archiving', 'comment']])), axis = 1)
-term_orig['id_content_hash_licence'] = term_orig.apply(lambda x: hash(tuple(x[['version', 'cost_factor', 'embargo_months', 'archiving', 'licence', 'comment']])), axis = 1)
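The built-in `hash()` used above is salted per process for strings, so the resulting ids differ from one run to the next. From Python 3.10 on, the hash of a NaN also depends on object identity. If reproducible ids are wanted, a digest from `hashlib` is one alternative. This is only a sketch: `content_hash` and the `<missing>` marker are hypothetical choices, not what the script does.

```python
import hashlib
import math


def content_hash(values):
    """Return a hex digest that is stable across Python processes."""
    normalised = []
    for value in values:
        # Treat None and NaN as the same "missing" marker.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            normalised.append('<missing>')
        else:
            normalised.append(str(value))
    # The unit separator keeps ('a', 'bc') distinct from ('ab', 'c').
    joined = '\x1f'.join(normalised)
    return hashlib.sha256(joined.encode('utf-8')).hexdigest()
```

It could be applied row-wise in the same way as the lambda above, for example `term_orig.apply(lambda x: content_hash(x[['version', 'comment']]), axis=1)`. Values are compared by their string form, so `1` and `1.0` would hash differently.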
-
-
-# In[114]:
-
-
-term_orig.sort_values(by='id_content_hash')
-
-
-# In[115]:
-
-
-# duplicates
-term_orig.loc[term_orig.duplicated(subset='id_content_hash')].sort_values(by='id_content_hash')
-
-
-# In[116]:
-
-
-term_orig['licence'] = term_orig['licence'].fillna(999999)
-term_orig['licence'] = term_orig['licence'].astype(int)
-term_orig['cost_factor'] = term_orig['cost_factor'].fillna(999999)
-term_orig['cost_factor'] = term_orig['cost_factor'].astype(int)
-# term_orig['embargo_months'] = term_orig['embargo_months'].fillna(0)
-# term_orig['embargo_months'] = term_orig['embargo_months'].astype(int)
-term_orig.loc[term_orig['archiving'] == True, 'ir_archiving'] = 1
-term_orig.loc[term_orig['archiving'] == False, 'ir_archiving'] = 0
-term_orig['ir_archiving'] = term_orig['ir_archiving'].fillna(0)
-term_orig
-
-
-# In[117]:
-
-
-term_orig.loc[term_orig['ir_archiving'].isna()]
-
-
-# In[118]:
-
-
-term_orig['ir_archiving'].value_counts()
-
-
-# In[119]:
-
-
-term_orig['licence'] = term_orig['licence'].astype(int)
-term_orig['ir_archiving'] = term_orig['ir_archiving'].astype(int)
-term_orig['cost_factor'] = term_orig['cost_factor'].astype(int)
-term_orig
-
-
-# In[120]:
-
-
-terms_export_dates = term_orig.loc[(term_orig['valid_from'].notna()) | (term_orig['valid_until'].notna())][['id_content_hash', 'ror', 'valid_from', 'valid_until']]
-terms_export_dates
-
-
-# In[121]:
-
-
-terms_export = term_orig[['id_sherpa', 'rp_id', 'id_content_hash', 'id_content_hash_licence', 'version', 'cost_factor', 'embargo_months', 'ir_archiving', 'licence', 'comment']]
-terms_export
-
-
-# In[122]:
-
-
-# check for duplicates
-terms_export.loc[terms_export.duplicated(subset='id_content_hash')].sort_values(by='id_content_hash')
-
-
-# In[123]:
-
-
-terms_export_dedup = terms_export.drop_duplicates(subset=['id_content_hash'])
-terms_export_dedup
-
-
-# In[124]:
-
-
-terms_export_dedup_licence = terms_export.drop_duplicates(subset=['id_content_hash_licence'])
-terms_export_dedup_licence
-
-
-# In[125]:
-
-
-# check for duplicates
-terms_export_dedup_licence.loc[terms_export_dedup_licence.duplicated(subset='id_content_hash')].sort_values(by='id_content_hash')
-
-
-# In[126]:
-
-
-# totals for the two sources
-terms_export_dedup.loc[terms_export_dedup['id_sherpa'].notna()].shape[0]
-
-
-# In[127]:
-
-
-terms_export_dedup.loc[terms_export_dedup['rp_id'].notna()].shape[0]
-
-
-# In[128]:
-
-
-terms_export_dedup.loc[terms_export_dedup['rp_id'].notna()]
-
-
-# In[129]:
-
-
-# convert the index into an id
-terms_export_dedup.reset_index(inplace=True)
-del terms_export_dedup['index']
-terms_export_dedup
-
-
-# In[130]:
-
-
-# add the id as index + 1
-terms_export_dedup['id'] = terms_export_dedup.index + 1
-# del terms_export_dedup['index']
-terms_export_dedup
-
-
-# In[131]:
-
-
-terms_export_dedup['source'] = ''
-terms_export_dedup
-
-
-# In[132]:
-
-
-# group by licence
-terms_export_dedup_licences = terms_export_dedup_licence[['licence', 'id_content_hash']]
-terms_export_dedup_licences
-
-
-# In[133]:
-
-
-# concatenate the values that share the same id
-terms_export_dedup_licences['licence'] = terms_export_dedup_licences['licence'].astype(str)
-terms_export_dedup_licences = terms_export_dedup_licences.groupby('id_content_hash').agg({'licence': lambda x: ', '.join(x)})
-terms_export_dedup_licences
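The grouping above collapses all licences that share a content hash into one comma-separated string. A small sketch on made-up values shows the shape of the result. `licence_demo` and `grouped_demo` are hypothetical names.

```python
import pandas as pd

licence_demo = pd.DataFrame({
    'id_content_hash': [11, 11, 22],
    'licence': [3, 7, 3],
})

# str.join needs strings, hence the cast before grouping.
licence_demo['licence'] = licence_demo['licence'].astype(str)
grouped_demo = licence_demo.groupby('id_content_hash').agg(
    {'licence': lambda x: ', '.join(x)}
)
```

The result is indexed by `id_content_hash`, which is why it can be merged back on that key afterwards.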
-
-
-# In[134]:
-
-
-# check for rows with multiple values
-terms_export_dedup_licences.loc[terms_export_dedup_licences['licence'].str.contains(',')]
-
-
-# In[135]:
-
-
-# add the grouped licences
-terms_export_dedup_fin = pd.merge(terms_export_dedup, terms_export_dedup_licences, on='id_content_hash', how='left')
-terms_export_dedup_fin
-
-
-# In[136]:
-
-
-# merge with the dates to get the term ids
-terms_export_dates = pd.merge(terms_export_dates, terms_export_dedup_fin[['id_content_hash', 'id']], on='id_content_hash')
-terms_export_dates = terms_export_dates.rename(columns = {'id' : 'term'})
-terms_export_dates
-
-
-# In[137]:
-
-
-# rename the licence fields
-del terms_export_dedup_fin['licence_x']
-terms_export_dedup_fin = terms_export_dedup_fin.rename(columns = {'licence_y' : 'licence'})
-
-
-# In[138]:
-
-
-terms_export_fin = terms_export_dedup_fin[['version', 'cost_factor', 'embargo_months', 'ir_archiving', 'licence', 'comment', 'id', 'source']]
-terms_export_fin
-
-
-# In[139]:
-
-
-# export the table
-result = terms_export_fin.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/term.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[140]:
-
-
-# export csv
-terms_export_fin.to_csv('sample/term.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[141]:
-
-
-# export excel
-terms_export_fin.to_excel('sample/term.xlsx', index=False)
-
-
-# ## Table condition_type
-
-# In[142]:
-
-
-# Journal-only, Organization-only, Journal-organization agreement
-col_names = ['id',
- 'condition_issuer'
- ]
-condition_type = pd.DataFrame(columns = col_names)
-condition_type = condition_type.append({'id' : 1, 'condition_issuer' : 'Journal-only'}, ignore_index=True)
-condition_type = condition_type.append({'id' : 2, 'condition_issuer' : 'Organization-only'}, ignore_index=True)
-condition_type = condition_type.append({'id' : 3, 'condition_issuer' : 'Journal-organization agreement'}, ignore_index=True)
-condition_type
-
-
-# In[143]:
-
-
-# export the table
-result = condition_type.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/condition_type.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[144]:
-
-
-# export csv
-condition_type.to_csv('sample/condition_type.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# In[145]:
-
-
-# export excel
-condition_type.to_excel('sample/condition_type.xlsx', index=False)
-
-
-# ## Table organization
-
-# In[146]:
-
-
-# extract the organizations (funders)
-sherpa
-
-
-# In[147]:
-
-
-sherpa.loc[sherpa['prerequisite_funders'].notna()]
-
-
-# In[148]:
-
-
-sherpa['prerequisite_funders'].value_counts()
-
-
-# In[149]:
-
-
-funders = sherpa.loc[sherpa['prerequisite_funders'].notna()][['prerequisite_funders_name', 'prerequisite_funders_fundref', 'prerequisite_funders_ror', 'prerequisite_funders_country', 'prerequisite_funders_url', 'prerequisite_funders_sherpa_id']]
-funders
-
-
-# In[150]:
-
-
-funders_dedup = funders.drop_duplicates(subset='prerequisite_funders_ror')
-funders_dedup
-
-
-# In[151]:
-
-
-funders_dedup.shape[0]
-
-
-# In[152]:
-
-
-# export excel
-funders_dedup.to_excel('sample/funders.xlsx', index=False)
-
-
-# In[153]:
-
-
-# export csv
-funders_dedup.to_csv('sample/funders.tsv', index=False)
-
-
-# In[154]:
-
-
-# create the DataFrame
-organization_funders = funders_dedup
-organization_funders = organization_funders.rename(columns = {'prerequisite_funders_name' : 'name',
- 'prerequisite_funders_fundref' : 'fundref',
- 'prerequisite_funders_ror' : 'ror',
- 'prerequisite_funders_country' : 'iso_code',
- 'prerequisite_funders_url' : 'website',
- 'prerequisite_funders_sherpa_id' : 'sherpa_id'
- })
-organization_funders
-
-
-# In[155]:
-
-
-# link to the countries
-country = pd.read_csv('sample/country.tsv', encoding='utf-8', header=0, sep='\t')
-country
-
-
-# In[156]:
-
-
-# merge with the countries
-organization_funders['iso_code'] = organization_funders['iso_code'].str.upper()
-organization_funders['is_funder'] = 1
-organization_funders = pd.merge(organization_funders, country[['iso_code', 'id']], how='left', on='iso_code')
-organization_funders
-
-
-# In[157]:
-
-
-organization_funders = organization_funders.rename(columns = {'id' : 'country'})
-organization_funders
-
-
-# In[158]:
-
-
-# add the Swiss organizations
-organization = pd.read_csv('ror/ror_ch_hei_export.tsv', encoding='utf-8', header=0, sep='\t', dtype={'fundref': str, 'orgref': str}, na_filter=False)
-organization
-
-
-# In[159]:
-
-
-# sort by name
-organization = organization.sort_values(by='name')
-organization
-
-
-# In[160]:
-
-
-organization = organization.reset_index(drop=True)
-organization
-
-
-# In[161]:
-
-
-# put EPFL in position 1 and UNIGE in position 2
-target_row = 32
-# Move target row to first element of list.
-idx = [target_row] + [i for i in range(len(organization)) if i != target_row]
-organization = organization.iloc[idx]
-organization
-
-
-# In[162]:
-
-
-organization = organization.reset_index(drop=True)
-organization
-
-
-# In[163]:
-
-
-# put EPFL in position 1 and UNIGE in position 2
-target_row = 45
-# Move target row to first element of list.
-idx = [target_row] + [i for i in range(len(organization)) if i != target_row]
-organization = organization.iloc[idx]
-organization
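The two reordering cells above depend on hard-coded row positions (32 and 45), which shift whenever the input list changes. Selecting the rows by name is one way to avoid that. The sketch below uses made-up organization names; `org_demo`, `priority` and the `_rank` helper column are hypothetical.

```python
import pandas as pd

org_demo = pd.DataFrame({'name': [
    'ETH Zurich', 'EPFL', 'University of Bern', 'University of Geneva',
]})

# Names that should come first, in the order they should appear.
priority = ['EPFL', 'University of Geneva']

# Priority names get their list position as rank; all others rank after them.
rank = org_demo['name'].map({n: i for i, n in enumerate(priority)})
rank = rank.fillna(len(priority))

org_demo = (
    org_demo.assign(_rank=rank)
    .sort_values(['_rank', 'name'])
    .drop(columns='_rank')
    .reset_index(drop=True)
)
```

Rows outside the priority list stay in alphabetical order, matching the earlier sort by name.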
-
-
-# In[164]:
-
-
-organization = organization.reset_index(drop=True)
-organization
-
-
-# In[165]:
-
-
-# add the funders
-organization = organization.append(organization_funders, ignore_index=True)
-organization
-
-
-# In[166]:
-
-
-# strip the DOI prefix from the fundref id, since that URL only resolves to JSON
-# current URL: http://data.crossref.org/fundingdata/funder/10.13039/[fundref id]
-# e.g. http://dx.doi.org/10.13039/501100007903
-# redirects to: http://data.crossref.org/fundingdata/funder/10.13039/501100007903
-# URL of the funded publications: https://search.crossref.org/funding?q=[fundref id]&from_ui=yes
-# e.g. https://search.crossref.org/funding?q=501100003006&from_ui=yes
-# regex=False treats the prefix as a literal string (the dots are not wildcards)
-organization['fundref'] = organization['fundref'].str.replace('http://dx.doi.org/10.13039/', '', regex=False)
-organization
-
-
-# In[167]:
-
-
-# DataFrame for the export (copied so that the assignments below do not touch a view)
-organization_export = organization[['name', 'website', 'country', 'starting_year', 'is_funder', 'ror', 'fundref']].copy()
-organization_export
-
-
-# In[168]:
-
-
-# fill in the empty values
-organization_export['starting_year'] = organization_export['starting_year'].fillna(0)
-organization_export['fundref'] = organization_export['fundref'].fillna('')
-organization_export['ror'] = organization_export['ror'].fillna('')
-organization_export
-
-
-# In[169]:
-
-
-# add the id as index + 1
-organization_export['id'] = organization_export.index + 1
-# del terms_export_dedup['index']
-organization_export
-
-
-# In[170]:
-
-
-# export the table
-result = organization_export.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/organization.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[171]:
-
-
-# export excel
-organization_export.to_excel('sample/organization.xlsx', index=False)
-
-
-# In[172]:
-
-
-# export csv
-organization_export.to_csv('sample/organization.tsv', sep='\t', encoding='utf-8', index=False)
-
-
-# ## Table condition_set_term
-
-# In[173]:
-
-
-term_orig
-
-
-# In[174]:
-
-
-terms_export_dedup
-
-
-# In[175]:
-
-
-# merge the term ids
-term_orig = pd.merge(term_orig, terms_export_dedup[['id_content_hash', 'id']], on='id_content_hash', how='left')
-term_orig
-
-
-# In[176]:
-
-
-term_orig = term_orig.rename(columns = {'id' : 'term'})
-term_orig
-
-
-# In[177]:
-
-
-condition_type
-
-
-# In[178]:
-
-
-# assign the condition type: 3 (agreement) by default, 1 (journal-only) when no organization is linked
-term_orig['condition_type'] = 3
-term_orig.loc[term_orig['ror'].isna(), 'condition_type'] = 1
-term_orig
-
-
-# In[179]:
-
-
-organization_export
-
-
-# In[180]:
-
-
-# merge the organizations
-term_orig = pd.merge(term_orig, organization_export[['ror', 'id']], on='ror', how='left')
-term_orig
-
-
-# In[181]:
-
-
-term_orig = term_orig.rename(columns = {'id' : 'organization'})
-term_orig
-
-
-# In[182]:
-
-
-# concatenate the values that share the same id
-condition_set_term_dedup_terms = term_orig[['term', 'id_content_hash']]
-condition_set_term_dedup_terms_dedup = condition_set_term_dedup_terms.drop_duplicates()
-condition_set_term_dedup_terms_dedup = condition_set_term_dedup_terms_dedup.loc[condition_set_term_dedup_terms_dedup['term'].notna()]
-condition_set_term_dedup_terms_dedup['term'] = condition_set_term_dedup_terms_dedup['term'].astype(int)
-condition_set_term_dedup_terms_dedup['term'] = condition_set_term_dedup_terms_dedup['term'].astype(str)
-condition_set_term_dedup_terms_dedup = condition_set_term_dedup_terms_dedup.groupby('id_content_hash').agg({'term': lambda x: ', '.join(x)})
-condition_set_term_dedup_terms_dedup
-
-
-# In[183]:
-
-
-# concatenate the values that share the same id
-condition_set_term_dedup_journals = term_orig[['journal', 'id_content_hash']]
-condition_set_term_dedup_journals_dedup = condition_set_term_dedup_journals.drop_duplicates()
-condition_set_term_dedup_journals_dedup = condition_set_term_dedup_journals_dedup.loc[condition_set_term_dedup_journals_dedup['journal'].notna()]
-condition_set_term_dedup_journals_dedup['journal'] = condition_set_term_dedup_journals_dedup['journal'].astype(int)
-condition_set_term_dedup_journals_dedup['journal'] = condition_set_term_dedup_journals_dedup['journal'].astype(str)
-condition_set_term_dedup_journals_dedup = condition_set_term_dedup_journals_dedup.groupby('id_content_hash').agg({'journal': lambda x: ', '.join(x)})
-condition_set_term_dedup_journals_dedup
-
-
-# In[184]:
-
-
-# concatenate the values that share the same id
-condition_set_term_dedup_organizations = term_orig[['organization', 'id_content_hash']]
-condition_set_term_dedup_organizations_dedup = condition_set_term_dedup_organizations.drop_duplicates()
-condition_set_term_dedup_organizations_dedup = condition_set_term_dedup_organizations_dedup.loc[condition_set_term_dedup_organizations_dedup['organization'].notna()]
-condition_set_term_dedup_organizations_dedup['organization'] = condition_set_term_dedup_organizations_dedup['organization'].astype(int)
-condition_set_term_dedup_organizations_dedup['organization'] = condition_set_term_dedup_organizations_dedup['organization'].astype(str)
-condition_set_term_dedup_organizations_dedup = condition_set_term_dedup_organizations_dedup.groupby('id_content_hash').agg({'organization': lambda x: ', '.join(x)})
-condition_set_term_dedup_organizations_dedup
-
-
-# In[185]:
-
-
-# concatenate the values that share the same id: not possible for condition_type
-condition_set_term_dedup_condition_types = term_orig[['condition_type', 'id_content_hash']]
-condition_set_term_dedup_condition_types_dedup = condition_set_term_dedup_condition_types.drop_duplicates()
-condition_set_term_dedup_condition_types_dedup = condition_set_term_dedup_condition_types_dedup.loc[condition_set_term_dedup_condition_types_dedup['condition_type'].notna()]
-# condition_set_term_dedup_condition_types_dedup['condition_type'] = condition_set_term_dedup_condition_types_dedup['condition_type'].astype(int)
-# condition_set_term_dedup_condition_types_dedup['condition_type'] = condition_set_term_dedup_condition_types_dedup['condition_type'].astype(str)
-# condition_set_term_dedup_condition_types_dedup = condition_set_term_dedup_condition_types_dedup.groupby('id_content_hash').agg({'condition_type': lambda x: ', '.join(x)})
-condition_set_term_dedup_condition_types_dedup
-
-
-# In[186]:
-
-
-# retrieve the grouped ids
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_terms_dedup, on='id_content_hash', how='left')
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_journals_dedup, on='id_content_hash', how='left')
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_organizations_dedup, on='id_content_hash', how='left')
-terms_export_dedup = pd.merge(terms_export_dedup, condition_set_term_dedup_condition_types_dedup, on='id_content_hash', how='left')
-terms_export_dedup
-
-
-# In[187]:
-
-
-condition_sets_orig = terms_export_dedup[['term', 'condition_type', 'organization', 'journal']]
-condition_sets_orig
-
-
-# In[188]:
-
-
-# add a unique hash for each variant
-condition_sets_orig['id_term_hash'] = condition_sets_orig.apply(lambda x: hash(tuple(x[['condition_type', 'organization', 'journal']])), axis = 1)
-condition_sets_orig
-
-
-# In[189]:
-
-
-# group the terms that share the same values for all other fields
-condition_sets_orig_terms = condition_sets_orig[['term', 'id_term_hash']]
-condition_sets_orig_terms_dedup = condition_sets_orig_terms.drop_duplicates()
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['term'].notna()]
-condition_sets_orig_terms_dedup['term'] = condition_sets_orig_terms_dedup['term'].astype(int)
-condition_sets_orig_terms_dedup['term'] = condition_sets_orig_terms_dedup['term'].astype(str)
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.groupby('id_term_hash').agg({'term': lambda x: ', '.join(x)})
-condition_sets_orig_terms_dedup
-
-
-# In[190]:
-
-
-# add the grouped ids
-condition_sets_orig_terms = pd.merge(condition_sets_orig, condition_sets_orig_terms_dedup, on='id_term_hash', how='left')
-condition_sets_orig_terms
-
-
-# In[191]:
-
-
-# rename terms
-del condition_sets_orig_terms['term_x']
-condition_sets_orig_terms = condition_sets_orig_terms.rename(columns = {'term_y' : 'term'})
-condition_sets_orig_terms
-
-
-# In[192]:
-
-
-# test duplicates
-condition_sets_orig_terms.loc[condition_sets_orig_terms.duplicated()].sort_values(by='term')
-
-
-# In[193]:
-
-
-condition_sets_orig_terms.loc[condition_sets_orig_terms.duplicated()].shape[0]
-
-
-# In[194]:
-
-
-condition_sets_orig_terms_dedup = condition_sets_orig_terms.drop_duplicates()
-condition_sets_orig_terms_dedup
-
-
-# In[195]:
-
-
-# add the missing fields
-condition_sets_orig_terms_dedup['comment'] = ''
-
-
-# In[196]:
-
-
-# check for NaN values in 'journal'
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['journal'].isna()]
-
-
-# In[197]:
-
-
-# check for NaN values in 'term'
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['term'].isna()]
-
-
-# In[198]:
-
-
-# check for NaN values in 'condition_type'
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['condition_type'].isna()]
-
-
-# In[199]:
-
-
-# check for NaN values in 'organization'
-condition_sets_orig_terms_dedup.loc[condition_sets_orig_terms_dedup['organization'].isna()]
-
-
-# In[200]:
-
-
-# replace NaN values in 'organization' with empty strings
-condition_sets_orig_terms_dedup['organization'] = condition_sets_orig_terms_dedup['organization'].fillna('')
-condition_sets_orig_terms_dedup
-
-
-# In[201]:
-
-
-# convert the index into an id
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.reset_index()
-# set the id to index + 1
-condition_sets_orig_terms_dedup['id'] = condition_sets_orig_terms_dedup['index'] + 1
-del condition_sets_orig_terms_dedup['index']
-condition_sets_orig_terms_dedup
-
-
-# In[202]:
-
-
-# convert the index into an id
-condition_sets_orig_terms_dedup = condition_sets_orig_terms_dedup.reset_index()
-# set the id to index + 1
-condition_sets_orig_terms_dedup['id'] = condition_sets_orig_terms_dedup['index'] + 1
-del condition_sets_orig_terms_dedup['index']
-condition_sets_orig_terms_dedup
-
-
-# In[203]:
-
-
-# export the table
-result = condition_sets_orig_terms_dedup[['id', 'condition_type', 'organization', 'journal', 'term', 'comment']].to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/condition_set.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[204]:
-
-
-# export excel
-condition_sets_orig_terms_dedup[['id', 'condition_type', 'organization', 'journal', 'term', 'comment']].to_excel('sample/condition_set.xlsx', index=False)
-
-
-# In[205]:
-
-
-# export tsv (tab-separated, to match the .tsv extension)
-condition_sets_orig_terms_dedup[['id', 'condition_type', 'organization', 'journal', 'term', 'comment']].to_csv('sample/condition_set.tsv', sep='\t', index=False)
-
-
-# ## Table organization_condition_set
-
-# In[206]:
-
-
-condition_sets_orig_terms_dedup
-
-
-# In[207]:
-
-
-condition_sets_orig_terms_dedup.loc[(condition_sets_orig_terms_dedup['organization'].notna()) & (condition_sets_orig_terms_dedup['organization'] != '')]
-
-
-# In[208]:
-
-
-# create the DataFrame
-# col_names = ['id',
-# 'organization',
-# 'condition_set',
-# 'valid_from',
-# 'valid_until'
-# ]
-# organization_condition = pd.DataFrame(columns = col_names)
-organization_condition = condition_sets_orig_terms_dedup.loc[(condition_sets_orig_terms_dedup['organization'].notna()) & (condition_sets_orig_terms_dedup['organization'] != '')][['id', 'organization', 'term']]
-organization_condition
-
-
-# In[209]:
-
-
-# extract the term ids
-organization_condition_split = organization_condition.assign(term = organization_condition.term.str.split(',')).explode('term')
-organization_condition_split
-
-
-# In[210]:
-
-
-organization_condition_split.loc[organization_condition_split['organization'].isna()]
-
-
-# In[211]:
-
-
-organization_condition_split.loc[organization_condition_split['term'].isna()]
-
-
-# In[212]:
-
-
-organization_condition_split['term'] = organization_condition_split['term'].astype(int)
-organization_condition_split
-
-
-# In[213]:
-
-
-# add the ROR
-terms_export_dates
-
-
-# In[214]:
-
-
-# merge to get the validity dates
-organization_condition_split = pd.merge(organization_condition_split, terms_export_dates[['term', 'valid_from', 'valid_until']], on='term', how='left')
-organization_condition_split
-
-
-# In[215]:
-
-
-# deduplicate
-organization_condition_split_dedup = organization_condition_split.drop_duplicates()
-organization_condition_split_dedup
-
-
-# In[216]:
-
-
-organization_condition = pd.merge(organization_condition, organization_condition_split_dedup[['id', 'valid_from', 'valid_until']], on='id', how='left')
-organization_condition
-
-
-# In[217]:
-
-
-organization_condition = organization_condition.rename(columns = {'id' : 'condition_set'})
-organization_condition['valid_from'] = organization_condition['valid_from'].fillna('')
-organization_condition['valid_until'] = organization_condition['valid_until'].fillna('')
-organization_condition
-
-
-# In[218]:
-
-
-# final split to get one row per organization
-organization_condition_fin = organization_condition.assign(organization = organization_condition.organization.str.split(',')).explode('organization')
-organization_condition_fin
-
-
-# In[219]:
-
-
-# set the id to index + 1
-organization_condition_fin = organization_condition_fin.reset_index()
-organization_condition_fin['id'] = organization_condition_fin.index + 1
-del organization_condition_fin['index']
-organization_condition_fin
-
-
-# In[220]:
-
-
-# export the table
-result = organization_condition_fin[['id', 'condition_set', 'organization', 'valid_from', 'valid_until']].to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/organization_condition.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[221]:
-
-
-# export excel
-organization_condition_fin[['id', 'condition_set', 'organization', 'valid_from', 'valid_until']].to_excel('sample/organization_condition.xlsx', index=False)
-
-
-# In[222]:
-
-
-# export tsv (tab-separated, to match the .tsv extension)
-organization_condition_fin[['id', 'condition_set', 'organization', 'valid_from', 'valid_until']].to_csv('sample/organization_condition.tsv', sep='\t', index=False)
-
-
-# ## Table journal_condition_set
-
-# In[223]:
-
-
-# create the DataFrame
-# col_names = ['id',
-# 'journal',
-# 'condition_set',
-# 'valid_from',
-# 'valid_until'
-# ]
-# journal_condition = pd.DataFrame(columns = col_names)
-journal_condition = condition_sets_orig_terms_dedup.loc[(condition_sets_orig_terms_dedup['journal'].notna()) & (condition_sets_orig_terms_dedup['journal'] != '')][['id', 'journal']]
-journal_condition
-
-
-# In[224]:
-
-
-journal_condition = journal_condition.rename(columns = {'id' : 'condition_set'})
-journal_condition['valid_from'] = ''
-journal_condition['valid_until'] = ''
-journal_condition
-
-
-# In[225]:
-
-
-# final split to get one row per journal
-journal_condition_fin = journal_condition.assign(journal = journal_condition.journal.str.split(',')).explode('journal')
-journal_condition_fin
-
-
-# In[226]:
-
-
-# set the id to index + 1
-journal_condition_fin = journal_condition_fin.reset_index()
-journal_condition_fin['id'] = journal_condition_fin.index + 1
-del journal_condition_fin['index']
-journal_condition_fin
-
-
-# In[227]:
-
-
-# export the table
-result = journal_condition_fin.to_json(orient='records', force_ascii=False)
-parsed = json.loads(result)
-with open('sample/journal_condition.json', 'w', encoding='utf-8') as file:
- json.dump(parsed, file, indent=2, ensure_ascii=False)
-
-
-# In[228]:
-
-
-# export excel
-journal_condition_fin.to_excel('sample/journal_condition.xlsx', index=False)
-
-
-# In[229]:
-
-
-# export tsv (tab-separated, to match the .tsv extension)
-journal_condition_fin.to_csv('sample/journal_condition.tsv', sep='\t', index=False)
-
-
-# In[ ]:
-
-
-
-
diff --git a/import_scripts/99_oacct_import.md b/import_scripts/99_oacct_import.md
deleted file mode 100644
index 75df7e4f..00000000
--- a/import_scripts/99_oacct_import.md
+++ /dev/null
@@ -1,212 +0,0 @@
-# Open Access Compliance Check Tool (OACCT) Project
-
-P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-
-This notebook imports data using the API:
-
-https://oacct-test.epfl.ch/api/
-
-Example with Journals:
-
-https://oacct-test.epfl.ch/api/journal/
-
-GET /api/journal/
-
-HTTP 200 OK
-Allow: GET, POST, HEAD, OPTIONS
-Content-Type: application/json
-Vary: Accept
-
-[]
-
-Media type: application/json
-
-Content:
-``` json
-{
- "issn": [],
- "name": "",
- "name_short_iso_4": "",
- "website": "",
- "oa_options": "",
- "starting_year": null,
- "end_year": null,
- "doaj_seal": false,
- "doaj_status": false,
- "lockss": false,
- "nlch": false,
- "portico": false,
- "qoam_av_score": null
-}
-```
-
-
-
-```python
-import json
-import requests
-import codecs
-oacct_login = 'oacct_test'
-oacct_pwd = '2f4dBRhyj7'
-headers = {'accept': 'application/json'}
-```
-
-
-```python
-# test without authentication
-url = 'https://oacct-test.epfl.ch/api/country/'
-r = requests.get(url)
-print(r)
-```
-
-
-
-
-
-```python
-print(r.text)
-```
-
- [{"id":1,"name":"Afghanistan","iso_code":"AF"},{"id":249,"name":"Åland Islands","iso_code":"AX"},{"id":2,"name":"Albania","iso_code":"AL"},{"id":3,"name":"Algeria","iso_code":"DZ"},{"id":4,"name":"American Samoa","iso_code":"AS"},{"id":5,"name":"Andorra","iso_code":"AD"},{"id":6,"name":"Angola","iso_code":"AO"},{"id":7,"name":"Anguilla","iso_code":"AI"},{"id":8,"name":"Antarctica","iso_code":"AQ"},{"id":9,"name":"Antigua and Barbuda","iso_code":"AG"},{"id":10,"name":"Argentina","iso_code":"AR"},{"id":11,"name":"Armenia","iso_code":"AM"},{"id":12,"name":"Aruba","iso_code":"AW"},{"id":13,"name":"Australia","iso_code":"AU"},{"id":14,"name":"Austria","iso_code":"AT"},{"id":15,"name":"Azerbaijan","iso_code":"AZ"},{"id":16,"name":"Bahamas (the)","iso_code":"BS"},{"id":17,"name":"Bahrain","iso_code":"BH"},{"id":18,"name":"Bangladesh","iso_code":"BD"},{"id":19,"name":"Barbados","iso_code":"BB"},{"id":20,"name":"Belarus","iso_code":"BY"},{"id":21,"name":"Belgium","iso_code":"BE"},{"id":22,"name":"Belize","iso_code":"BZ"},{"id":23,"name":"Benin","iso_code":"BJ"},{"id":24,"name":"Bermuda","iso_code":"BM"},{"id":25,"name":"Bhutan","iso_code":"BT"},{"id":26,"name":"Bolivia (Plurinational State of)","iso_code":"BO"},{"id":27,"name":"Bonaire, Sint Eustatius and Saba","iso_code":"BQ"},{"id":28,"name":"Bosnia and Herzegovina","iso_code":"BA"},{"id":29,"name":"Botswana","iso_code":"BW"},{"id":30,"name":"Bouvet Island","iso_code":"BV"},{"id":31,"name":"Brazil","iso_code":"BR"},{"id":32,"name":"British Indian Ocean Territory (the)","iso_code":"IO"},{"id":33,"name":"Brunei Darussalam","iso_code":"BN"},{"id":34,"name":"Bulgaria","iso_code":"BG"},{"id":35,"name":"Burkina Faso","iso_code":"BF"},{"id":36,"name":"Burundi","iso_code":"BI"},{"id":37,"name":"Cabo Verde","iso_code":"CV"},{"id":38,"name":"Cambodia","iso_code":"KH"},{"id":39,"name":"Cameroon","iso_code":"CM"},{"id":40,"name":"Canada","iso_code":"CA"},{"id":41,"name":"Cayman Islands 
(the)","iso_code":"KY"},{"id":42,"name":"Central African Republic (the)","iso_code":"CF"},{"id":43,"name":"Chad","iso_code":"TD"},{"id":44,"name":"Chile","iso_code":"CL"},{"id":45,"name":"China","iso_code":"CN"},{"id":46,"name":"Christmas Island","iso_code":"CX"},{"id":47,"name":"Cocos (Keeling) Islands (the)","iso_code":"CC"},{"id":48,"name":"Colombia","iso_code":"CO"},{"id":49,"name":"Comoros (the)","iso_code":"KM"},{"id":50,"name":"Congo (the Democratic Republic of the)","iso_code":"CD"},{"id":51,"name":"Congo (the)","iso_code":"CG"},{"id":52,"name":"Cook Islands (the)","iso_code":"CK"},{"id":53,"name":"Costa Rica","iso_code":"CR"},{"id":59,"name":"Côte d'Ivoire","iso_code":"CI"},{"id":54,"name":"Croatia","iso_code":"HR"},{"id":55,"name":"Cuba","iso_code":"CU"},{"id":56,"name":"Curaçao","iso_code":"CW"},{"id":57,"name":"Cyprus","iso_code":"CY"},{"id":58,"name":"Czechia","iso_code":"CZ"},{"id":60,"name":"Denmark","iso_code":"DK"},{"id":61,"name":"Djibouti","iso_code":"DJ"},{"id":62,"name":"Dominica","iso_code":"DM"},{"id":63,"name":"Dominican Republic (the)","iso_code":"DO"},{"id":64,"name":"Ecuador","iso_code":"EC"},{"id":65,"name":"Egypt","iso_code":"EG"},{"id":66,"name":"El Salvador","iso_code":"SV"},{"id":67,"name":"Equatorial Guinea","iso_code":"GQ"},{"id":68,"name":"Eritrea","iso_code":"ER"},{"id":69,"name":"Estonia","iso_code":"EE"},{"id":70,"name":"Eswatini","iso_code":"SZ"},{"id":71,"name":"Ethiopia","iso_code":"ET"},{"id":72,"name":"Falkland Islands (the) [Malvinas]","iso_code":"FK"},{"id":73,"name":"Faroe Islands (the)","iso_code":"FO"},{"id":74,"name":"Fiji","iso_code":"FJ"},{"id":75,"name":"Finland","iso_code":"FI"},{"id":76,"name":"France","iso_code":"FR"},{"id":77,"name":"French Guiana","iso_code":"GF"},{"id":78,"name":"French Polynesia","iso_code":"PF"},{"id":79,"name":"French Southern Territories (the)","iso_code":"TF"},{"id":80,"name":"Gabon","iso_code":"GA"},{"id":81,"name":"Gambia 
(the)","iso_code":"GM"},{"id":82,"name":"Georgia","iso_code":"GE"},{"id":83,"name":"Germany","iso_code":"DE"},{"id":84,"name":"Ghana","iso_code":"GH"},{"id":85,"name":"Gibraltar","iso_code":"GI"},{"id":86,"name":"Greece","iso_code":"GR"},{"id":87,"name":"Greenland","iso_code":"GL"},{"id":88,"name":"Grenada","iso_code":"GD"},{"id":89,"name":"Guadeloupe","iso_code":"GP"},{"id":90,"name":"Guam","iso_code":"GU"},{"id":91,"name":"Guatemala","iso_code":"GT"},{"id":92,"name":"Guernsey","iso_code":"GG"},{"id":93,"name":"Guinea","iso_code":"GN"},{"id":94,"name":"Guinea-Bissau","iso_code":"GW"},{"id":95,"name":"Guyana","iso_code":"GY"},{"id":96,"name":"Haiti","iso_code":"HT"},{"id":97,"name":"Heard Island and McDonald Islands","iso_code":"HM"},{"id":98,"name":"Holy See (the)","iso_code":"VA"},{"id":99,"name":"Honduras","iso_code":"HN"},{"id":100,"name":"Hong Kong","iso_code":"HK"},{"id":101,"name":"Hungary","iso_code":"HU"},{"id":102,"name":"Iceland","iso_code":"IS"},{"id":103,"name":"India","iso_code":"IN"},{"id":104,"name":"Indonesia","iso_code":"ID"},{"id":250,"name":"International Agency","iso_code":"OI"},{"id":105,"name":"Iran (Islamic Republic of)","iso_code":"IR"},{"id":106,"name":"Iraq","iso_code":"IQ"},{"id":107,"name":"Ireland","iso_code":"IE"},{"id":108,"name":"Isle of Man","iso_code":"IM"},{"id":109,"name":"Israel","iso_code":"IL"},{"id":110,"name":"Italy","iso_code":"IT"},{"id":111,"name":"Jamaica","iso_code":"JM"},{"id":112,"name":"Japan","iso_code":"JP"},{"id":113,"name":"Jersey","iso_code":"JE"},{"id":114,"name":"Jordan","iso_code":"JO"},{"id":115,"name":"Kazakhstan","iso_code":"KZ"},{"id":116,"name":"Kenya","iso_code":"KE"},{"id":117,"name":"Kiribati","iso_code":"KI"},{"id":118,"name":"Korea (the Democratic People's Republic of)","iso_code":"KP"},{"id":119,"name":"Korea (the Republic of)","iso_code":"KR"},{"id":120,"name":"Kuwait","iso_code":"KW"},{"id":121,"name":"Kyrgyzstan","iso_code":"KG"},{"id":122,"name":"Lao People's Democratic Republic 
(the)","iso_code":"LA"},{"id":123,"name":"Latvia","iso_code":"LV"},{"id":124,"name":"Lebanon","iso_code":"LB"},{"id":125,"name":"Lesotho","iso_code":"LS"},{"id":126,"name":"Liberia","iso_code":"LR"},{"id":127,"name":"Libya","iso_code":"LY"},{"id":128,"name":"Liechtenstein","iso_code":"LI"},{"id":129,"name":"Lithuania","iso_code":"LT"},{"id":130,"name":"Luxembourg","iso_code":"LU"},{"id":131,"name":"Macao","iso_code":"MO"},{"id":132,"name":"Madagascar","iso_code":"MG"},{"id":133,"name":"Malawi","iso_code":"MW"},{"id":134,"name":"Malaysia","iso_code":"MY"},{"id":135,"name":"Maldives","iso_code":"MV"},{"id":136,"name":"Mali","iso_code":"ML"},{"id":137,"name":"Malta","iso_code":"MT"},{"id":138,"name":"Marshall Islands (the)","iso_code":"MH"},{"id":139,"name":"Martinique","iso_code":"MQ"},{"id":140,"name":"Mauritania","iso_code":"MR"},{"id":141,"name":"Mauritius","iso_code":"MU"},{"id":142,"name":"Mayotte","iso_code":"YT"},{"id":143,"name":"Mexico","iso_code":"MX"},{"id":144,"name":"Micronesia (Federated States of)","iso_code":"FM"},{"id":145,"name":"Moldova (the Republic of)","iso_code":"MD"},{"id":146,"name":"Monaco","iso_code":"MC"},{"id":147,"name":"Mongolia","iso_code":"MN"},{"id":148,"name":"Montenegro","iso_code":"ME"},{"id":149,"name":"Montserrat","iso_code":"MS"},{"id":150,"name":"Morocco","iso_code":"MA"},{"id":151,"name":"Mozambique","iso_code":"MZ"},{"id":152,"name":"Myanmar","iso_code":"MM"},{"id":153,"name":"Namibia","iso_code":"NA"},{"id":154,"name":"Nauru","iso_code":"NR"},{"id":155,"name":"Nepal","iso_code":"NP"},{"id":156,"name":"Netherlands (the)","iso_code":"NL"},{"id":157,"name":"New Caledonia","iso_code":"NC"},{"id":158,"name":"New Zealand","iso_code":"NZ"},{"id":159,"name":"Nicaragua","iso_code":"NI"},{"id":160,"name":"Niger (the)","iso_code":"NE"},{"id":161,"name":"Nigeria","iso_code":"NG"},{"id":162,"name":"Niue","iso_code":"NU"},{"id":163,"name":"Norfolk Island","iso_code":"NF"},{"id":164,"name":"North 
Macedonia","iso_code":"MK"},{"id":165,"name":"Northern Mariana Islands (the)","iso_code":"MP"},{"id":166,"name":"Norway","iso_code":"NO"},{"id":167,"name":"Oman","iso_code":"OM"},{"id":168,"name":"Pakistan","iso_code":"PK"},{"id":169,"name":"Palau","iso_code":"PW"},{"id":170,"name":"Palestine, State of","iso_code":"PS"},{"id":171,"name":"Panama","iso_code":"PA"},{"id":172,"name":"Papua New Guinea","iso_code":"PG"},{"id":173,"name":"Paraguay","iso_code":"PY"},{"id":174,"name":"Peru","iso_code":"PE"},{"id":175,"name":"Philippines (the)","iso_code":"PH"},{"id":176,"name":"Pitcairn","iso_code":"PN"},{"id":177,"name":"Poland","iso_code":"PL"},{"id":178,"name":"Portugal","iso_code":"PT"},{"id":179,"name":"Puerto Rico","iso_code":"PR"},{"id":180,"name":"Qatar","iso_code":"QA"},{"id":184,"name":"Réunion","iso_code":"RE"},{"id":181,"name":"Romania","iso_code":"RO"},{"id":182,"name":"Russian Federation (the)","iso_code":"RU"},{"id":183,"name":"Rwanda","iso_code":"RW"},{"id":185,"name":"Saint Barthélemy","iso_code":"BL"},{"id":186,"name":"Saint Helena, Ascension and Tristan da Cunha","iso_code":"SH"},{"id":187,"name":"Saint Kitts and Nevis","iso_code":"KN"},{"id":188,"name":"Saint Lucia","iso_code":"LC"},{"id":189,"name":"Saint Martin (French part)","iso_code":"MF"},{"id":190,"name":"Saint Pierre and Miquelon","iso_code":"PM"},{"id":191,"name":"Saint Vincent and the Grenadines","iso_code":"VC"},{"id":192,"name":"Samoa","iso_code":"WS"},{"id":193,"name":"San Marino","iso_code":"SM"},{"id":194,"name":"Sao Tome and Principe","iso_code":"ST"},{"id":195,"name":"Saudi Arabia","iso_code":"SA"},{"id":196,"name":"Senegal","iso_code":"SN"},{"id":197,"name":"Serbia","iso_code":"RS"},{"id":198,"name":"Seychelles","iso_code":"SC"},{"id":199,"name":"Sierra Leone","iso_code":"SL"},{"id":1000000,"name":"Sildavie2","iso_code":"II"},{"id":200,"name":"Singapore","iso_code":"SG"},{"id":201,"name":"Sint Maarten (Dutch 
part)","iso_code":"SX"},{"id":202,"name":"Slovakia","iso_code":"SK"},{"id":203,"name":"Slovenia","iso_code":"SI"},{"id":204,"name":"Solomon Islands","iso_code":"SB"},{"id":205,"name":"Somalia","iso_code":"SO"},{"id":206,"name":"South Africa","iso_code":"ZA"},{"id":207,"name":"South Georgia and the South Sandwich Islands","iso_code":"GS"},{"id":208,"name":"South Sudan","iso_code":"SS"},{"id":209,"name":"Spain","iso_code":"ES"},{"id":210,"name":"Sri Lanka","iso_code":"LK"},{"id":211,"name":"Sudan (the)","iso_code":"SD"},{"id":212,"name":"Suriname","iso_code":"SR"},{"id":213,"name":"Svalbard and Jan Mayen","iso_code":"SJ"},{"id":214,"name":"Sweden","iso_code":"SE"},{"id":215,"name":"Switzerland","iso_code":"CH"},{"id":216,"name":"Syrian Arab Republic (the)","iso_code":"SY"},{"id":217,"name":"Taiwan (Province of China)","iso_code":"TW"},{"id":218,"name":"Tajikistan","iso_code":"TJ"},{"id":219,"name":"Tanzania, the United Republic of","iso_code":"TZ"},{"id":220,"name":"Thailand","iso_code":"TH"},{"id":221,"name":"Timor-Leste","iso_code":"TL"},{"id":222,"name":"Togo","iso_code":"TG"},{"id":223,"name":"Tokelau","iso_code":"TK"},{"id":224,"name":"Tonga","iso_code":"TO"},{"id":225,"name":"Trinidad and Tobago","iso_code":"TT"},{"id":226,"name":"Tunisia","iso_code":"TN"},{"id":227,"name":"Turkey","iso_code":"TR"},{"id":228,"name":"Turkmenistan","iso_code":"TM"},{"id":229,"name":"Turks and Caicos Islands (the)","iso_code":"TC"},{"id":230,"name":"Tuvalu","iso_code":"TV"},{"id":231,"name":"Uganda","iso_code":"UG"},{"id":232,"name":"Ukraine","iso_code":"UA"},{"id":233,"name":"United Arab Emirates (the)","iso_code":"AE"},{"id":234,"name":"United Kingdom of Great Britain and Northern Ireland (the)","iso_code":"GB"},{"id":235,"name":"United States Minor Outlying Islands (the)","iso_code":"UM"},{"id":236,"name":"United States of America 
(the)","iso_code":"US"},{"id":999999,"name":"UNKNOWN","iso_code":"__"},{"id":237,"name":"Uruguay","iso_code":"UY"},{"id":238,"name":"Uzbekistan","iso_code":"UZ"},{"id":239,"name":"Vanuatu","iso_code":"VU"},{"id":240,"name":"Venezuela (Bolivarian Republic of)","iso_code":"VE"},{"id":241,"name":"Viet Nam","iso_code":"VN"},{"id":242,"name":"Virgin Islands (British)","iso_code":"VG"},{"id":243,"name":"Virgin Islands (U.S.)","iso_code":"VI"},{"id":244,"name":"Wallis and Futuna","iso_code":"WF"},{"id":245,"name":"Western Sahara*","iso_code":"EH"},{"id":246,"name":"Yemen","iso_code":"YE"},{"id":247,"name":"Zambia","iso_code":"ZM"},{"id":248,"name":"Zimbabwe","iso_code":"ZW"}]
-
-
-
-```python
-# test with authentication
-url = 'https://oacct-test.epfl.ch/api/country/3'
-r2 = requests.get(url, auth=(oacct_login, oacct_pwd))
-print(r2)
-```
-
-
-
-
-
-```python
-print(r2.text)
-```
-
- {"id":3,"name":"Algeria","iso_code":"DZ"}
-
-
-
-```python
-journal = {
- "id": 1,
- "name": "Revue médicale suisse",
- "name_short_iso_4": "Rev. méd. suisse",
- "starting_year": "2005",
- "end_year": "9999",
- "website": "",
- "country": 215.0,
- "language": "138",
- "publisher": "1",
- "doaj_seal": 0,
- "doaj_status": 0,
- "lockss": 0,
- "portico": 0,
- "nlch": 0,
- "qoam_av_score": "",
- "oa_status": 1,
- "issn": "1234-5678"
- }
-```
-
-
-```python
-# test with POST
-url = 'https://oacct-test.epfl.ch/api/journal/'
-r2 = requests.post(url, auth=(oacct_login, oacct_pwd), headers=headers, data=journal)
-print(r2)
-```
-
-
-
-
-
-```python
-print(r2.text)
-```
-
- {"issn":["This field is required."]}
-
-
-
-```python
-country = {
- "name": "Sildavie",
- "iso_code": "II",
- "id": 333
- }
-```
-
-
-```python
-# test with POST
-url = 'https://oacct-test.epfl.ch/api/country/'
-r2 = requests.post(url, auth=(oacct_login, oacct_pwd), headers=headers, data=country)
-print(r2)
-```
-
-
-
-
-
-```python
-print(r2.json())
-```
-
- {'id': 1000001, 'name': 'Sildavie', 'iso_code': 'II'}
-
-
-
-```python
-country2 = {
- "id": 1000000,
- "name": "Sildavie3",
- "iso_code": "II"
-}
-```
-
-
-```python
-# test with PUT
-url = 'https://oacct-test.epfl.ch/api/country/1000000'
-r2 = requests.put(url, auth=(oacct_login, oacct_pwd), headers=headers, data=country2)
-print(r2)
-```
-
-
-
-
-
-```python
-print(r2.json())
-```
-
- {'id': 1000000, 'name': 'Sildavie2', 'iso_code': 'II'}
-
-
-
-```python
-# convert to json
-json_response = r2.json()
-print(json_response)
-```
-
- {'id': 1000000, 'name': 'Sildavie2', 'iso_code': 'II'}
-
-
-
-```python
-# get the name
-name = json_response['name']
-name
-```
-
-
-
-
- 'Sildavie2'
-
-
diff --git a/import_scripts/99_oacct_import.py b/import_scripts/99_oacct_import.py
deleted file mode 100644
index 8e2588c1..00000000
--- a/import_scripts/99_oacct_import.py
+++ /dev/null
@@ -1,191 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-
-# # Open Access Compliance Check Tool (OACCT) Project
-#
-# P5 project of the EPFL Library, in collaboration with the libraries of the Universities of Geneva, Lausanne and Bern: https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1
-#
-# This notebook imports data using the API:
-#
-# https://oacct-test.epfl.ch/api/
-#
-# Example with Journals:
-#
-# https://oacct-test.epfl.ch/api/journal/
-#
-# GET /api/journal/
-#
-# HTTP 200 OK
-# Allow: GET, POST, HEAD, OPTIONS
-# Content-Type: application/json
-# Vary: Accept
-#
-# []
-#
-# Media type: application/json
-#
-# Content:
-# ``` json
-# {
-# "issn": [],
-# "name": "",
-# "name_short_iso_4": "",
-# "website": "",
-# "oa_options": "",
-# "starting_year": null,
-# "end_year": null,
-# "doaj_seal": false,
-# "doaj_status": false,
-# "lockss": false,
-# "nlch": false,
-# "portico": false,
-# "qoam_av_score": null
-# }
-# ```
-#
-
-# In[1]:
-
-
-import json
-import requests
-import codecs
-oacct_login = 'oacct_test'
-oacct_pwd = '2f4dBRhyj7'
-headers = {'accept': 'application/json'}
-
-
-# In[2]:
-
-
-# test without authentication
-url = 'https://oacct-test.epfl.ch/api/country/'
-r = requests.get(url)
-print(r)
-
-
-# In[3]:
-
-
-print(r.text)
-
-
-# In[6]:
-
-
-# test with authentication
-url = 'https://oacct-test.epfl.ch/api/country/3'
-r2 = requests.get(url, auth=(oacct_login, oacct_pwd))
-print(r2)
-
-
-# In[7]:
-
-
-print(r2.text)
-
-
-# In[9]:
-
-
-journal = {
- "id": 1,
- "name": "Revue médicale suisse",
- "name_short_iso_4": "Rev. méd. suisse",
- "starting_year": "2005",
- "end_year": "9999",
- "website": "",
- "country": 215.0,
- "language": "138",
- "publisher": "1",
- "doaj_seal": 0,
- "doaj_status": 0,
- "lockss": 0,
- "portico": 0,
- "nlch": 0,
- "qoam_av_score": "",
- "oa_status": 1,
- "issn": "1234-5678"
- }
-
-
-# In[11]:
-
-
-# test with POST
-url = 'https://oacct-test.epfl.ch/api/journal/'
-r2 = requests.post(url, auth=(oacct_login, oacct_pwd), headers=headers, data=journal)
-print(r2)
-
-
-# In[12]:
-
-
-print(r2.text)
-
-
-# In[13]:
-
-
-country = {
- "name": "Sildavie",
- "iso_code": "II",
- "id": 333
- }
-
-
-# In[14]:
-
-
-# test with POST
-url = 'https://oacct-test.epfl.ch/api/country/'
-r2 = requests.post(url, auth=(oacct_login, oacct_pwd), headers=headers, data=country)
-print(r2)
-
-
-# In[15]:
-
-
-print(r2.json())
-
-
-# In[16]:
-
-
-country2 = {
- "id": 1000000,
- "name": "Sildavie3",
- "iso_code": "II"
-}
-
-
-# In[17]:
-
-
-# test with PUT
-url = 'https://oacct-test.epfl.ch/api/country/1000000'
-r2 = requests.put(url, auth=(oacct_login, oacct_pwd), headers=headers, data=country2)
-print(r2)
-
-
-# In[18]:
-
-
-print(r2.json())
-
-
-# In[19]:
-
-
-# convert to json
-json_response = r2.json()
-print(json_response)
-
-
-# In[20]:
-
-
-# get the name
-name = json_response['name']
-name
-
diff --git a/import_scripts/README.md b/import_scripts/README.md
deleted file mode 100644
index 041865ab..00000000
--- a/import_scripts/README.md
+++ /dev/null
@@ -1,9 +0,0 @@
-Original IPython notebooks converted to pure Python scripts and Markdown documents:
-
-```
-ipython nbconvert --to script *.ipynb
-ipython nbconvert --to markdown *.ipynb
-```
-=> easier Git version control
-
-Snapshot on 2021-09-23 AB
diff --git a/sphinx/django_api.rst b/sphinx/django_api.rst
index c4c26c85..13445081 100644
--- a/sphinx/django_api.rst
+++ b/sphinx/django_api.rst
@@ -1,79 +1,79 @@
django\_api package
===================
-The django\_api package implements the admin backend and web service components of the OACCT application.
+The django\_api package implements the admin backend and web service components of the OACT application.
It uses the Django REST framework https://www.django-rest-framework.org/ and the standard Django admin site.
Subpackages
-----------
.. toctree::
:maxdepth: 4
Submodules
----------
django\_api.admin module
------------------------
.. automodule:: django_api.admin
:members:
:undoc-members:
:show-inheritance:
django\_api.apps module
-----------------------
.. automodule:: django_api.apps
:members:
:undoc-members:
:show-inheritance:
django\_api.models module
-------------------------
.. automodule:: django_api.models
:members:
:undoc-members:
:show-inheritance:
django\_api.serializers module
------------------------------
.. automodule:: django_api.serializers
:members:
:undoc-members:
:show-inheritance:
django\_api.tests module
------------------------
.. automodule:: django_api.tests
:members:
:undoc-members:
:show-inheritance:
django\_api.urls module
-----------------------
.. automodule:: django_api.urls
:members:
:undoc-members:
:show-inheritance:
django\_api.views module
------------------------
.. automodule:: django_api.views
:members:
:undoc-members:
:show-inheritance:
Module contents
---------------
.. automodule:: django_api
:members:
:undoc-members:
:show-inheritance:
diff --git a/sphinx/index.rst b/sphinx/index.rst
index 482678e2..7dc1cd0e 100644
--- a/sphinx/index.rst
+++ b/sphinx/index.rst
@@ -1,29 +1,29 @@
-.. OACCT documentation master file, created by
+.. OACT documentation master file, created by
sphinx-quickstart on Mon Sep 6 14:29:45 2021.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
-Welcome to OACCT's documentation!
+Welcome to OACT's documentation!
=================================
.. toctree::
:maxdepth: 2
:caption: Contents:
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
* :doc:`sphinx_howto`
Contents
--------
.. toctree::
modules
sphinx_howto
diff --git a/sphinx/modules.rst b/sphinx/modules.rst
index 8fca2f03..e73be6f6 100644
--- a/sphinx/modules.rst
+++ b/sphinx/modules.rst
@@ -1,9 +1,9 @@
-open-access-compliance-check-tool-oacct
+open-access-check-tool-oact
=======================================
.. toctree::
:maxdepth: 4
django_api
django_app
manage
diff --git a/static/assets/by-nc-sa.png b/static/assets/by-nc-sa.png
new file mode 100644
index 00000000..b9a55533
Binary files /dev/null and b/static/assets/by-nc-sa.png differ
diff --git a/styleguide.config.js b/styleguide.config.js
index 173e4ac3..1fe9bb61 100644
--- a/styleguide.config.js
+++ b/styleguide.config.js
@@ -1,89 +1,89 @@
// const path = require('path');
module.exports = {
- title:"OACCT Documentation",
- version:"0.0.1",
+ title:"OACT Documentation",
+ version:"1.0",
theme: {
color: {
link: '#3771C8',
linkHover: '#D40000'
},
fontFamily: {
- base: '"Helvetica Neue", cursive'
+ base: 'Helvetica, Arial, sans-serif'
}
},
styleguideDir:"reactDoc/styleguide",
// styles: {
// Logo: {
// // We're changing the LogoRenderer component
// logo: {
// // We're changing the rsg--logo-XX class name inside the component
// // animation: '$blink ease-in-out 300ms infinite'
// // },
// // '@keyframes blink': {
// // to: { opacity: 0 }
// // }
// }
// },
sections: [
{
name: 'Introduction',
content: './assets/docs/introduction.md'
},
{
name: 'Documentation',
sections: [
{
name: 'Installation',
content: './assets/docs/installation.md',
description: 'The description for the installation section'
},
{
name: '[Django]Backend Configuration',
description: 'The description for the backend section using Django',
content: './assets/docs/backend_configuration.md'
},
{
name: '[React]Frontend Configuration',
description: 'The description for the frontend section using React.js',
content: './assets/docs/frontend_configuration.md'
},
{
name: 'Live Demo',
external: true,
href: 'https://oacct-test.epfl.ch/#/'
}
]
},
{
name: 'React UI Components',
// content: 'docs/ui.md',
components: ['./assets/src/pages/**/*.js','./assets/src/components/**/*.js'],
exampleMode: 'expand', // 'hide' | 'collapse' | 'expand'
usageMode: 'expand' // 'hide' | 'collapse' | 'expand'
},
{
name: 'React context',
// content: 'docs/ui.md',
components: './assets/src/ContextProvider.js',
content: './assets/src/ContextProvider.md',
exampleMode: 'expand', // 'hide' | 'collapse' | 'expand'
usageMode: 'expand' // 'hide' | 'collapse' | 'expand'
},
{
name: 'API',
content: './assets/src/services/api.md',
exampleMode: 'expand', // 'hide' | 'collapse' | 'expand'
usageMode: 'expand', // 'hide' | 'collapse' | 'expand'
sections: [
{
name: 'Requests',
content: './assets/src/services/requests/requests.md',
description: 'What is the request syntax?'
},
]
}
]
-}
\ No newline at end of file
+}
diff --git a/templates/admin/are_you_sure.html b/templates/admin/are_you_sure.html
new file mode 100644
index 00000000..0eb9e112
--- /dev/null
+++ b/templates/admin/are_you_sure.html
@@ -0,0 +1,19 @@
+{% extends "admin/base_site.html" %}
+
+{% block content %}
+
+
+{% endblock %}
\ No newline at end of file
diff --git a/templates/assets/index.html b/templates/assets/index.html
index 923ff045..dff7dc9a 100644
--- a/templates/assets/index.html
+++ b/templates/assets/index.html
@@ -1,39 +1,39 @@
{% load static %}
- OACCT | Test Version
+ OACT
You need to enable JavaScript to run this app.