diff --git a/modules/websearch/doc/Makefile.am b/modules/websearch/doc/Makefile.am
index 5fd635c31..6b01612f9 100644
--- a/modules/websearch/doc/Makefile.am
+++ b/modules/websearch/doc/Makefile.am
@@ -1,31 +1,35 @@
 ## $Id$
 ## This file is part of the CERN Document Server Software (CDSware).
 ## Copyright (C) 2002 CERN.
 ##
 ## The CDSware is free software; you can redistribute it and/or
 ## modify it under the terms of the GNU General Public License as
 ## published by the Free Software Foundation; either version 2 of the
 ## License, or (at your option) any later version.
 ##
 ## The CDSware is distributed in the hope that it will be useful, but
 ## WITHOUT ANY WARRANTY; without even the implied warranty of
 ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 ## General Public License for more details.
 ##
 ## You should have received a copy of the GNU General Public License
 ## along with CDSware; if not, write to the Free Software Foundation, Inc.,
 ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
 SUBDIRS = admin hacking
 docdir = $(WEBDIR)/help/search
-doc_DATA=tips.html index.html
+doc_DATA=tips.en.html index.en.html index.fr.html
 EXTRA_DIST = $(wildcard *.wml)
 CLEANFILES = $(doc_DATA) *~ *.tmp
-%.html: %.html.wml ../../../config/config.wml ../../../config/configbis.wml
-	$(WML) -o $@ $<
\ No newline at end of file
+%.en.html: %.html.wml ../../../config/config.wml ../../../config/configbis.wml
+	$(WML) -o\(ALL-LANG_*\)+LANG_EN:$@ $<
+
+%.fr.html: %.html.wml ../../../config/config.wml ../../../config/configbis.wml
+	$(WML) -o\(ALL-LANG_*\)+LANG_FR:$@ $<
+
diff --git a/modules/websearch/doc/index.html.wml b/modules/websearch/doc/index.html.wml
deleted file mode 100644
index 5e98474b8..000000000
--- a/modules/websearch/doc/index.html.wml
+++ /dev/null
@@ -1,37 +0,0 @@
-## $Id$
-
-## This file is part of the CERN Document Server Software (CDSware).
-## Copyright (C) 2002 CERN.
-##
-## The CDSware is free software; you can redistribute it and/or
-## modify it under the terms of the GNU General Public License as
-## published by the Free Software Foundation; either version 2 of the
-## License, or (at your option) any later version.
-##
-## The CDSware is distributed in the hope that it will be useful, but
-## WITHOUT ANY WARRANTY; without even the implied warranty of
-## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-## General Public License for more details.
-##
-## You should have received a copy of the GNU General Public License
-## along with CDSware; if not, write to the Free Software Foundation, Inc.,
-## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-
-#include "cdspage.wml" \
- title="Search Help" \
- navbar_name="search-new" \
- navtrail_previous_links="/help/>Help Central" \
- navbar_select="tips"
-
-

Find out all about searching : - -

-
- -
Search Tips - -
This page presents you with useful tips and techniques in order to - help you use the site to the full. - -
-
diff --git a/modules/websearch/doc/tips.html.wml b/modules/websearch/doc/tips.html.wml
deleted file mode 100644
index 11c757826..000000000
--- a/modules/websearch/doc/tips.html.wml
+++ /dev/null
@@ -1,930 +0,0 @@
-## $Id$
-
-## This file is part of the CERN Document Server Software (CDSware).
-## Copyright (C) 2002 CERN.
-##
-## The CDSware is free software; you can redistribute it and/or
-## modify it under the terms of the GNU General Public License as
-## published by the Free Software Foundation; either version 2 of the
-## License, or (at your option) any later version.
-##
-## The CDSware is distributed in the hope that it will be useful, but
-## WITHOUT ANY WARRANTY; without even the implied warranty of
-## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-## General Public License for more details.
-##
-## You should have received a copy of the GNU General Public License
-## along with CDSware; if not, write to the Free Software Foundation, Inc.,
-## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-
-#include "cdspage.wml" \
- title="Search Tips" \
- navbar_name="search-new" \
- navtrail_previous_links="/help/>Help Central > /help/search/>Search Help" \
- navbar_select="tips"
-
-

Our search engine tries to offer today's typical web searching -experience, as gained with popular search engines such as Google. The nature of bibliographic -searching differs from that of web page searching, though. We -provide many extensions to enable a complex and precise structured -search, including a combined metadata, fulltext and reference search -in one go. This page lists several tips and tricks that you may find -useful to this effect. - -

-    Simple versus advanced search -
    Search guidance -
    Searching for words versus phrases -
    Boolean queries -
    Special characters and punctuation -
    International characters -
    Word truncation/stemming -
    Structured metadata search -
    Span queries -
    Combined metadata/fulltext/citation search -
    Frequently asked questions -
        How to wisely choose your search terms (speed-wise) -
        How to produce list of your publications -
        How to sort according to a certain pattern -
        How to get documents from other servers (Google, SPIRES, KEK) -
        How to search in fulltext files -
        How to search for citations - -

Simple versus advanced search

- -

The default search mode is simple search that -basically provides you with one input box where you can type your -query, followed by a possibility to choose one of the common indexes -to search within. You would usually simply type the keywords you are -interested in and hit return. For example, if you are interested in -documents on standard model from Ellis, you would -type: - -

-
- - - -
-
- -and on the search results page you would further add/remove keywords -to get at what you were looking for. - -

The advanced search interface provides you with -explicit tools to play with: you can change the matching type from -the default word matching to phrase searching or the regular matching; -you can use boolean queries in several indexes, etc. For example, to -find all the documents written by Ellis, J that contain -either of the words muon or neutrino in the title -and that were published in 2001, you would type: - -

-
- - - - - - - - - - - - - - - - - - - - - - - -
- - - -
- - - - -
- - -  
-
-
- -

Note that Simple Search can provide you with basically the same -functionality, if you make use of the special syntax that is explained in -the text below. The simple-versus-advanced distinction does not refer to the -functionality that is being provided but rather to the amount of -parametrization you can "tweak". We conform to the common -use of the simple/advanced terms as found in other search engines. - -

Much of what follows will deal with a question on "how a power user -would use the simple search interface". Recall that you can always go -to the Advanced Search for more query assistance. - -

Search guidance

- -

After you submit your query, the search engine will analyze it and -will try to always guide you in case no exact match could be found. -For example, it would print you a list of closest indexed terms in -case of spelling troubles: - -

-
- - - -
-
- -

Alternative choices will be printed in red. The search engine -will similarly warn you when your search terms could not be -found, or when they could but your boolean query couldn't be met. The -search engine will also silently try to search for alternative forms -(e.g. removed punctuation), etc. - -

Thanks to multiple search stages and the guidance provided at each -stage, it is usually sufficient to simply type what you are looking -for and see what the system says in return. If you aren't satisfied, -you would then add/remove words from your query until you get a satisfactory -reply. - -

Searching for words versus phrases

- -

The default search mode is a search for words. This -means that any whitespace you type is not significant, but is rather -interpreted to mean "add an automatic boolean AND between words", like -Google does. For example, to find all records that contain both the -word ellis and the word muon anywhere in the record, -type: - -

-
- - - -
-
- -The whitespace would be significant if you include it within quotes. -There are two phrase searching modes: - -
    - -
  1. The double quotes instruct the search engine to search for - exact phrase. This phrase search mode will match if and - only if the given metadata field is exactly equal to the input - pattern. For example, to find all the documents by Ellis, - J, type: - -
    -
    - - - -
    -
    - -
  2. The single quotes instruct the search engine to search for - partial phrase. Unlike exact phrase search, this mode - will find "subphrases" that match given pattern, thus allowing - for some words before/after given text. This is the mode Google - and other fulltext engines call a "phrase search". For example, - to find all the titles containing the expression muon - decay regardless of the position of the expression in the - title, type: - -
    -
    - - - -
    -
    - -
- -The difference between exact and partial phrase searches may not be -obvious. It's good to remember that the search engines usually offer -the latter. We offer the two at the moment, since the first one is -usually an order of magnitude faster. - -

Boolean queries

- -We have already seen how whitespace adds a silent boolean AND in the -search for words. The other boolean operators include: - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-+
AND
-
- ellis +muon - - matches all records that contain both the word - ellis and the word muon -
- ellis muon - - ditto, syntactic sugar -
- ellis and muon - - ditto, syntactic sugar -
--
NOT
-
- ellis -muon - -matches all records that contain the word - ellis but that do not contain the word - muon -
- ellis not muon - -ditto, syntactic sugar -
-|
OR
-
- ellis |muon - -matches all records that contain at least one - of the words -
- ellis or muon - -ditto, syntactic sugar -
-
- -

Logical operations are automatically chained from left to right (no -support for parentheses at the moment). This permits you to easily refine -your search by adding/removing words with the + and - signs. For example, -to find the documents that include the word muon or kaon, as well as the -word ellis, type: - -

-
- - - -
-
- -to get, say, 100 hits. Now if you want to exclude records dealing with -the decay, append the exclusion term at the end: - -
-
- - - -
-
- -to get, say, 70 hits in a refined list. Keep adding/removing terms -until you are satisfied. - -

Note again that a left-to-right boolean chaining means that, if you -type ellis muon or kaon you will be effectively searching -for a pseudo-expression "(ellis and muon) or kaon". A search for -"ellis and (muon or kaon)" is to be written as muon or kaon -ellis. - -
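The left-to-right chaining described above can be illustrated with a minimal Python sketch. It is not the search engine's actual code: the toy `INDEX` (word to record ids) and the `evaluate` helper are hypothetical names made up for illustration, and a query that starts with a negated term is not handled specially.

```python
from typing import Dict, Optional, Set

# Hypothetical toy index (an assumption for illustration): word -> record ids.
INDEX: Dict[str, Set[int]] = {
    "ellis": {1, 2, 3},
    "muon": {2, 3, 4},
    "kaon": {5},
}

# Operator prefixes and the boolean operation each one stands for.
PREFIXES = {"+": "and", "-": "not", "|": "or"}


def evaluate(query: str, index: Dict[str, Set[int]]) -> Set[int]:
    """Evaluate a word query strictly left to right, without precedence."""
    result: Optional[Set[int]] = None
    op = "and"  # whitespace between words means an implicit AND
    for token in query.lower().split():
        if token in ("and", "or", "not"):
            op = token
            continue
        if len(token) > 1 and token[0] in PREFIXES:
            op = PREFIXES[token[0]]
            token = token[1:]
        hits = index.get(token, set())
        if result is None:
            # First term: simply start from its hits (simplification).
            result = set(hits)
        elif op == "and":
            result &= hits
        elif op == "or":
            result |= hits
        else:  # "not"
            result -= hits
        op = "and"  # reset to the implicit AND for the next word
    return result if result is not None else set()
```

With this toy index, `evaluate("ellis muon or kaon", INDEX)` behaves like "(ellis and muon) or kaon", whereas `evaluate("muon or kaon ellis", INDEX)` behaves like "(muon or kaon) and ellis", which is the reordering trick the text recommends.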

Special characters and punctuation

- -

When indexing words, care is taken to index them both with and - without punctuation, so that you should be able to search for terms - containing special characters, such as C++, verbatim: - -

-
- - - -
-
- -
-
- - - -
-
- - For example, to find records containing the LaTeX expression - $e^{+}e^{-}$ in the title, type: - -
-
- - - -
-
- - For example, to find the document with the report number - hep-ph/0204133, type: - -
-
- - - -
-
- - Note that the search is case-insensitive: - -
-
- - - -
-
- -

International characters

- -

The search engine works with Unicode UTF-8 so you can type your - query strings in any language stored in the database. For - example, to find the documents written by (or on) Пушкин, type: - -

-
- - - -
-
- - Note that you don't have to type accents to find accented results. For example, - type Lemaitre to find papers by Lemaître: - -
-
- - - -
-
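One common way to make matching insensitive to accents is to fold both the indexed words and the query to an accent-free form before comparing them. The sketch below shows that idea in Python; it is an assumption about how such folding could be done, not a description of the search engine's indexer, and `fold` is a hypothetical helper name.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a case-folded copy of text with combining accents removed."""
    # NFKD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks. Caveat: this also simplifies letters such as
    # the Cyrillic short i, which decomposes into a base letter plus a breve.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, word: str) -> bool:
    """Compare a query and an indexed word after folding both."""
    return fold(query) == fold(word)
```

Under this scheme `matches("Lemaitre", "Lemaître")` holds, and non-Latin words without combining marks, such as Пушкин, are only case-folded.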
- -" "1" " -

- - - - - - - - - - -
- IMPORTANT NOTE FOR THE CERN SITE -
- At the moment, words including accented characters can only be retrieved by entering - accented characters in the query. -
-"> - -

Word truncation/stemming

- -

Word truncation is supported via the asterisk (*) wildcard - character. The wildcard instructs the search engine to match any - number of characters in that place. For example, to find records - that contain words such as muon, muons, muonic, - etc., type: - -

-
- - - -
-
- - The wildcard query works both in prefix and infix position. For - example, to get all the words that start by CERN-TH and - end by 31, type: - -
-
- - - -
-
- - Note that the wildcard will be ignored if you try to apply it to - very short words, such as a*: - -
-
- - - -
-
- - The wildcard character can be used also in the phrase searching - mode. For example, to find all the documents whose title starts by - "Neutrino mass", type: - -
-
- - - -
-
- - Recall that we have introduced exact and partial phrase search - modes. Actually, the partial phrase search mode launches an exact - search enclosed within wildcards: we could say that 'foo bar - baz' is equivalent to "*foo bar baz*". Now you can - see why the partial phrase search is slow: due to the use of two - asterisks before and after the text, each and every title in the - database has to be looked up to determine whether it matches or - not. (There are currently no partial phrase indexes.) - -
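The relation between the two phrase modes can be sketched in Python as follows. This is only an illustration under stated assumptions: `exact_phrase` and `partial_phrase` are hypothetical helpers, matching is done with the standard `fnmatch` module (which also treats `?` and `[...]` as special, unlike the asterisk-only wildcard described here), and the linear scan over all values mirrors the absence of a partial phrase index.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def exact_phrase(pattern: str, values: Iterable[str]) -> List[str]:
    """Return values equal to pattern, where * matches any run of characters."""
    lowered = pattern.lower()  # the search is case-insensitive
    return [v for v in values if fnmatchcase(v.lower(), lowered)]


def partial_phrase(pattern: str, values: Iterable[str]) -> List[str]:
    """A partial phrase search is an exact search wrapped in wildcards."""
    return exact_phrase("*" + pattern + "*", values)


titles = [
    "Neutrino mass",
    "Neutrino mass and mixing",
    "On the neutrino mass",
]
```

Here `exact_phrase("Neutrino mass*", titles)` keeps only the titles that start with the phrase, while `partial_phrase("neutrino mass", titles)` has to test every title and keeps all three.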

Structured metadata search

- -

Searching within various bibliographic fields (such as title, - author) is supported via a syntax similar to Google's "site:" operator. - If a search term is preceded by a field name and a colon, then the - term is searched for inside this field only. For example, to find - documents containing the word ellis within the author index, - type: - -

-
- - - -
-
- - To select documents written by Ellis that contain words - like muon, muons, muonic within title, - type: - -
-
- - - -
-
- - The most common fields you may want to use are - author, title, - reportnumber, abstract, - keyword, year, fulltext, - and reference. - -

Span queries

- -

The span query is provided via a -> sign. For -example, to search for all documents on muon decay published -between 1983 and 1992, type: - -

-
- - - -
-
- -To find all documents by authors with names ranging from Ellis, -J to Ellis, Qqq, type: - -
-
- - - -
-
- -

Combined metadata/fulltext/citation search

- -

All the syntax mentioned above can be combined together in one - query. For example, to find documents that have the word - ellis inside author fields, that do not contain words like - muon, muonic, etc. in any field, that contain the phrase - (or the substring, to be more precise) 'dense quark matter' inside - abstract fields, and that were published in a year starting with the digits - '200', type: - -

-
- - - -
-
- - Note that the default "any field" global index contains only the metadata terms, - not the citation or fulltext terms. You have to explicitly mention the fulltext - or reference index to search there. For example, to find the term Higgs - in either metadata, references or fulltext files, type: - -
-
- - - -
-
- - This permits an interesting combination of metadata, fulltext and citation search in - the same query. For example, to get all documents written by - Lin whose fulltext files contain the words - Schwarzschild and AdS, and which cite the journal - Adv. Theor. Math. Phys., type: - -
-
- - - -
-
- -" "1" " -

- - - - - - - - - - -
- IMPORTANT NOTE FOR THE CERN SITE -
- At the moment, fulltext files and references are not fully searchable on the CERN site. - Assumed operational time: Q1 2004. -
-"> - - -

Frequently asked questions

- -

How to wisely choose your search terms (speed-wise)

-

-

- -

How to produce list of your publications

-

-The author names are usually stored in a form with initials only such as Ellis, J. -To get the list of publications in this case, type: - -

-
- - - -
-
- -Sometimes the first name may be spelled in full, such as Ellis, John. -To get the list of publications for both forms at the same time, you could use a -wildcard query: - -
-
- - - -
-
- -or a partial phrase matching: - -
-
- - - -
-
- -The difference is the following. The former technique (double quotes with a wildcard) matches any author whose -name starts by the string Ellis, J, so that it can match names like -Ellis, J, Ellis, John, Ellis, Jonathan Richard, etc. -The latter technique (single quotes) matches any author whose name contains the string -Ellis, J so that in addition to previous matches you have also obtained a match for De Lellis, G. - -

Note that the latter kind of searching may be useful especially -in case of compound family names. For example, imagine that -Pepe-Altarelli, M is spelled on some documents as Altarelli, M. -If you then type: - -

-
- - - -
-
- -you will obtain hits for both forms (and possibly more, as already discussed above). - -

In the two cases mentioned above we have seen how important it is to pay attention to false positives. -To include only the author name versions you are interested in, you can use a boolean OR query: - -

-
- - - -
-
- -

Note that this may still lead to false positives, in case the abbreviated form -Ellis, J represents in addition a distinct person such as Ellis, Jim. -There is no way to distinguish them automatically. You may want to contact the -administrators of -who will take care of revising the author names with you so that the database would -contain a consistently spelled and properly formatted full name instead of just the -initials. - -

How to sort according to a certain pattern

- -

You may select a certain field according to which to sort the search - results, for example to sort the results by main title. However, - sometimes you may want to sort by a report number and it happens - that your documents have several of them. For example, the report - numbers hep-ph/0204140, CERN-TH-2002-069 and - RM3-TH-02-4 all denote the - same document. Now if you sort your search results set - containing this document, the system will take into consideration - the first report number, which may be any of these three. - Sometimes you may want to classify this document under its - hep-ph number, sometimes under its CERN number, - depending on whether you produce a list of CERN or hep-ph - publications. How can you influence the search engine to prefer - one report number rather than the other? - -

In other words, the search engine by default answers a query - like "sort by first author" or "sort by first report number", but - sometimes you may want to ask the search engine to "sort by first - report number that starts with the text CERN-". The latter - possibility is available via a "silent" sort parameter called - sp (for "sort pattern") that sorts preferentially - according to the values matching the given textual pattern, if any can be found. The - parameter is "silent" in the sense that it is not present in the search - interface; you have to add it manually to your search URL. - - For example, to get all CERN-TH publications of the year 2001 - sorted by their CERN-TH numbers, you would search for - CERN-TH-2001* within the reportnumber index, - and on the search results page, being satisfied with the results, - you would add &sp=CERN-TH to the URL to sort the - results preferentially by CERN-TH report numbers, to get a nicely - sorted list of all CERN-TH 2001 publications. - -
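Adding the sp parameter by hand amounts to appending one more key to the query string of the results URL. A minimal Python sketch of that step follows; only the parameter name sp comes from the text above, while the `add_sort_pattern` helper, the example host and the `p` and `f` parameters in the sample URL are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_sort_pattern(url: str, pattern: str) -> str:
    """Return url with an sp=<pattern> query parameter appended."""
    parts = urlsplit(url)
    # Keep the existing parameters, dropping any earlier sp value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "sp"
    ]
    params.append(("sp", pattern))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search results URL; the host and parameter names are made up.
results_url = "http://example.org/search?p=CERN-TH-2001*&f=reportnumber"
sorted_url = add_sort_pattern(results_url, "CERN-TH")
```

Note that `urlencode` percent-encodes the asterisk in the existing query, which does not change its meaning once the server decodes the URL.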

How to get documents from other servers (Google, SPIRES, KEK)

- -

On the search results page, links to other servers like Google, SPIRES or KEK are -automatically proposed in a box entitled "Try your search on". You -can simply click on the proposed links to run your query on these -search engines. - -

Note that the links aren't printed if the search engine doesn't -support it. For example, SPIRES or KEK cannot search for terms within -"any field", so we don't link to them in these cases. - -" "1" " - -

Note also that KEK has scanned a lot of old CERN reports. If - you find that we don't have the fulltext for some old CERN report, it - may be worth looking there. For example, search for CERN - ISR-MA/73-17 in our system: - -

-
- - - -
-
- - and you will see that CDS contains the document in the archives only, i.e. not in an electronic format. - However, if you follow the proposed KEK search link, - you will see that KEK proposes "scanned images" that you can download. -"> - -

How to search in fulltext files

- -

If a metadata record contains some associated fulltext files, -tries to extract the textual information from the files and index it into a separate fulltext index. -To search for all records that contain the term e- in their fulltext files, -type: - -

-
- - - -
-
- -Recall that fulltext words aren't included in the default global ``any field'' index, -but that you may freely combine a fulltext and metadata search. For example, to find all -articles written by Ellis that contain the word muon either in the -metadata or in the fulltext, type: - -
-
- - - -
-
- - -" "1" " -

- - - - - - - - - - -
- IMPORTANT NOTE FOR THE CERN SITE -
- At the moment, the fulltext indexes aren't available on the CERN site. - Assumed operational time: Q1 2004. - Please use the - old fulltext interface - instead in the meantime. -
-"> - -

How to search for citations

- -

If a metadata record contains an associated fulltext file, -tries to extract references automatically from that file and index -them into a separate reference index. To search for -all records that cite Ellis in their reference lists, -type: - -

-
- - - -
-
- -To search for all records that cite preprint hep-ph/0103062 -in their reference lists, type: - -
-
- - - -
-
- -To search for all records that cite an article from Giddings and Ross published in -Physical Review D in volume 61 in year 2000, type: - -
-
- - - -
-
- -Recall that citation terms aren't included in the default global "any field" index, -but that you may freely combine a citation search with a metadata search. -For example, to find all articles on standard model that aren't written by -Ellis but that do cite him, type: - -
-
- - - -
-
- -" "1" " -

- - - - - - - - - - -
- IMPORTANT NOTE FOR THE CERN SITE -
- At the moment, the reference indexes aren't available on the CERN site. - The citation search is therefore impossible at the moment. - Assumed operational time: Q1 2004. -
-"> -