The administrator can define a list of words which, unless they are part of an exact phrase, will be ignored. These are typically conjunctions and definite and indefinite articles. Additionally, the default conjunction (OR or AND) between words can be defined. See the Admin > Configure > Resource lists interface.
There are two types of search available:
A set number of database fields are searched:
Partial word searches are the default unless the search term is an exact phrase
You can use control words as noted below
The restrictions noted below on searches using the abstract and note fields pertain here
1 OR 2 AND 3 NOT 4 OR 5
will be grouped as 1 OR (2 AND 3 NOT 4) OR 5
In both types of search, the following rules hold for the word search phrase:
You can use the control words AND, OR and NOT and can group words into exact phrases using double quote marks
AND, OR and NOT are case-sensitive and function as control words only outside exact phrases
The wildcard characters ? (zero or one character) and * (zero or multiple characters) can be used. In an exact phrase, these characters will be treated literally
The use of wildcard characters disables partial word matching
The wildcard ? will not match a single multibyte UTF-8 character (such as an accented letter) due to the multibyte nature of UTF-8. Use * instead
Searches are case-insensitive
A space not in an exact phrase will be treated as OR
All non-alphanumeric characters (such as punctuation) not in an exact phrase will be ignored unless the character is a wildcard
OR words following AND or NOT will be grouped. You might choose, therefore, to have a string of OR words at the start of the phrase. Some examples:
word1 AND word2 OR word3 OR word4 NOT word5 word6
// gives
word1 AND (word2 OR word3 OR word4) NOT (word5 OR word6)
word1 word2 OR word3 word4 NOT word5 word6 AND word7
// gives
word1 OR word2 OR word3 OR word4 NOT (word5 OR word6) AND word7
NOT word1 word2 OR word3 OR word4 NOT word5 word6
// gives
NOT (word1 OR word2 OR word3 OR word4) NOT (word5 OR word6)
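The grouping and wildcard rules above can be sketched in Python. This is only an illustration of the documented behaviour, not the code the search actually runs: the function names `group_phrase` and `wildcard_to_regex` are hypothetical, the translation of wildcards to a regular expression is an assumption, and the sketch follows only the word-phrase examples directly above.

```python
import re

def group_phrase(phrase):
    """Spaces act as OR; OR words following AND or NOT are bracketed."""
    # Exact phrases in double quotes stay whole; everything else splits on spaces.
    tokens = re.findall(r'"[^"]*"|\S+', phrase)
    segments = []                      # (leading control word or None, [terms])
    operator, terms = None, []
    for token in tokens:
        if token in ("AND", "NOT"):    # control words are case-sensitive
            if operator or terms:
                segments.append((operator, terms))
            operator, terms = token, []
        elif token != "OR":            # an explicit OR is the same as a space
            terms.append(token)
    if operator or terms:
        segments.append((operator, terms))

    parts = []
    for operator, terms in segments:
        joined = " OR ".join(terms)
        if operator is None:
            parts.append(joined)       # leading OR words are not bracketed
        elif len(terms) > 1:
            parts.append(f"{operator} ({joined})")
        else:
            parts.append(f"{operator} {joined}".strip())
    return " ".join(parts)

def wildcard_to_regex(term):
    """Assumed translation: ? is zero or one character, * is zero or more."""
    return "".join({"?": ".?", "*": ".*"}.get(ch, re.escape(ch)) for ch in term)

def matches_bytewise(term, word):
    """Match on UTF-8 bytes, as a byte-oriented matcher would."""
    pattern = wildcard_to_regex(term).encode("utf-8")
    return re.fullmatch(pattern, word.encode("utf-8")) is not None

print(group_phrase("word1 AND word2 OR word3 OR word4 NOT word5 word6"))
# word1 AND (word2 OR word3 OR word4) NOT (word5 OR word6)
print(matches_bytewise("na?ve", "naïve"))   # False: ï occupies two bytes
print(matches_bytewise("na*ve", "naïve"))   # True
```

The last two lines show why * is the safer wildcard for accented or other non-ASCII characters when matching is done on bytes.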
You can attach files of any type to resources. For those that are text-type documents, a small number can be converted to text and cached for fulltext search from within Advanced Search. Following is a list of the major text-type document formats and their caching support for fulltext search.
The documents are analyzed according to their mime-type and then according to their extension if there is any ambiguity.
The text/plain mime-type is a generic format that covers a multitude of files.
As the search targets written documents, attachments with the following extensions are excluded: CSV, TSV, SYLK.
The encoding is assumed to be UTF-8 only, unless the format specification says otherwise.
Better PDF extraction quality requires the xpdftotext plugin.
Extracting PS (PostScript) files requires the ps2pdf converter included in Ghostscript.
Extracting DVI (DeVice Independent) files requires the catdvi converter included in TeX Live and other TeX distributions.
Extracting DJV (DjVu) files requires the djvutxt converter included in DjVuLibre.
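As a rough illustration of the converter requirements above, the following Python sketch maps an extension to the external tool named in the notes and checks whether that tool is installed. The names `CONVERTERS`, `build_command` and `missing_converters` are hypothetical, and the way the commands are assembled is an assumption about typical command-line use of these tools, not a description of how the caching is actually implemented.

```python
import shutil

# Extension -> external converter named in the notes above.
CONVERTERS = {
    "ps": "ps2pdf",     # Ghostscript: PostScript to PDF
    "dvi": "catdvi",    # writes the DVI file's text to standard output
    "djv": "djvutxt",   # DjVuLibre: extracts the text layer
    "djvu": "djvutxt",
}

def build_command(extension, source, target):
    """Return the argument list for the converter, or None if none is needed."""
    tool = CONVERTERS.get(extension.lower())
    if tool is None:
        return None
    if tool == "catdvi":
        return [tool, source]          # the caller captures standard output
    return [tool, source, target]

def missing_converters():
    """List the converters that cannot be found on the PATH."""
    return sorted({tool for tool in CONVERTERS.values() if shutil.which(tool) is None})

print(build_command("djvu", "book.djvu", "book.txt"))
# ['djvutxt', 'book.djvu', 'book.txt']
```

Running `missing_converters()` on the server is a quick way to see which of these formats cannot currently be extracted.
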
Extension | Kind of document | MIME Type |
---|---|---|
ABW, ZABW | AbiWord Document | application/x-abiword |
AWT | AbiWord Document Template | application/x-abiword |
DJV, DJVU | DjVu Document | image/vnd.djvu, image/x-djvu |
DOC | Word 97-2003 / DOS Word | application/msword |
DOCM | Word 2007-365 document+macro | application/vnd.ms-word.document.macroEnabled.12 |
DOCX | Word 2007-365 document | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
DOT, WPT | Word 97-2003 / DOS Word Template | application/msword |
DOTM | Word 2007-365 template+macro | application/vnd.ms-word.template.macroEnabled.12 |
DOTX | Word 2007-365 template | application/vnd.openxmlformats-officedocument.wordprocessingml.template |
DVI | DeVice Independent | application/x-dvi |
EPUB | Electronic publication | application/epub+zip |
FB1, FB2 | FictionBook 1.0 and 2.0 | application/x-fictionbook (private mimetype) |
FODP | ODF Presentation Flat | application/vnd.oasis.opendocument.presentation |
FODT | ODF XML Text Document Flat | application/vnd.oasis.opendocument.text |
HTM, HTML | HyperText Markup Language | text/html |
MD | Markdown | text/markdown |
MHT, MHTML | Multipart HTML | multipart/related, multipart/alternative, multipart/x-mimearchive, multipart/mixed, message/rfc822 |
ODP | ODF Presentation | application/vnd.oasis.opendocument.presentation |
ODT | ODF Text Document | application/vnd.oasis.opendocument.text |
OTP | ODF Presentation Template | application/vnd.oasis.opendocument.presentation-template |
OTT | ODF Text Template | application/vnd.oasis.opendocument.text-template |
PDF | Portable Document Format | application/pdf |
POTM | PowerPoint 2007-365 Template+macro | application/vnd.ms-powerpoint.template.macroEnabled.12 |
POTX | PowerPoint 2007-365 Template | application/vnd.openxmlformats-officedocument.presentationml.template |
PPTM | PowerPoint 2007-365 +macro | application/vnd.ms-powerpoint.presentation.macroEnabled.12 |
PPTX | PowerPoint 2007-365 | application/vnd.openxmlformats-officedocument.presentationml.presentation |
PS, EPS | PostScript | application/postscript |
RST, REST | reStructured text | text/plain |
RTF | Rich Text Format 1.9.1 | application/rtf or text/rtf |
SLA | Scribus Document | application/vnd.scribus |
STI | OpenOffice.org 1.0 Presentation Template | application/vnd.sun.xml.impress.template |
STW | OpenOffice.org 1.0 Text Template | application/vnd.sun.xml.writer.template |
SXI | OpenOffice.org 1.0 Presentation | application/vnd.sun.xml.impress |
SXW | OpenOffice.org 1.0 Text Document | application/vnd.sun.xml.writer |
TXT, others | Plain text | text/plain |
XHTML | Extensible HyperText Markup Language | application/xhtml+xml |
XML | Extensible Markup Language | application/xml or text/xml |
XPS, OXPS | XML Paper Specification | application/vnd.ms-xpsdocument |
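
The rule that documents are analyzed by mime-type first and by extension only where the mime-type is ambiguous can be sketched as follows. This is a minimal Python sketch using a handful of rows from the table above; the function name `detect_format` and the lookup tables are hypothetical, and treating text/plain as the only ambiguous mime-type is a simplifying assumption.

```python
from pathlib import PurePath

# Extensions excluded because the search targets written documents.
EXCLUDED_EXTENSIONS = {"csv", "tsv", "sylk"}

# A few unambiguous rows from the table above: mime-type -> kind of document.
BY_MIME = {
    "application/pdf": "Portable Document Format",
    "application/epub+zip": "Electronic publication",
    "text/html": "HyperText Markup Language",
    "text/markdown": "Markdown",
}

# text/plain is shared by several rows, so the extension decides.
PLAIN_TEXT_BY_EXTENSION = {
    "rst": "reStructured text",
    "rest": "reStructured text",
}

def detect_format(filename, mime_type):
    """Return the kind of document, or None if it is not cached for search."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    if extension in EXCLUDED_EXTENSIONS:
        return None
    if mime_type == "text/plain":
        # Ambiguous mime-type: fall back to the extension ("TXT, others").
        return PLAIN_TEXT_BY_EXTENSION.get(extension, "Plain text")
    return BY_MIME.get(mime_type)

print(detect_format("paper.pdf", "application/pdf"))  # Portable Document Format
print(detect_format("notes.rst", "text/plain"))       # reStructured text
print(detect_format("data.csv", "text/plain"))        # None
```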
Many old or rare office suite formats are not directly supported. They are listed here to remove any ambiguity. Consider converting them to a supported format before attaching them. Many free converters are available online.
DRM protected ebooks and password protected documents are not supported.
Extension | Kind of document | MIME Type |
---|---|---|
CWK | ClarisWorks/AppleWorks Document | |
HWP | Hangul WP 97 | |
KWD | KWord | application/vnd.kde.kword |
LRF | BroadBand Ebook | |
LWP | Lotus WordPro | application/vnd.lotus-wordpro |
MAN, MDOC | Manpage, mandoc | text/troff |
MW, MCW | MacWrite Document | |
MWD | Mariner Mac Write Classic | |
PAGES | Apple Pages | |
PDB | PalmDoc | |
PDB | Plucker eBook | |
POT | PowerPoint 97-2003 Template | application/vnd.ms-powerpoint |
PPT | PowerPoint 97-2003 | application/vnd.ms-powerpoint |
SDD | StarOffice presentation | application/vnd.stardivision.impress, application/x-starimpress |
SDW | StarOffice Document | application/vnd.stardivision.writer, application/x-starwriter |
TEI | Text Encoding Initiative | application/tei+xml |
TEX, LATEX | TeX, LaTeX | |
TEXI | TexInfo File | |
TROFF, ROFF | Groff, Roff, Troff | text/troff |
UOF, UOT | Unified Office Text | |
UOP | Unified Office presentation | |
WML | Wireless Mark-up Language | text/vnd.wap.wml |
WMLC | Wireless Mark-up Language | application/vnd.wap.wmlc |
WN | WriteNow Document | |
WPD | Wordperfect | application/vnd.wordperfect or application/wordperfect5.1 |
WPS | Microsoft Works | application/vnd.ms-works |
WRI | Microsoft Write | application/mswrite |