Roget’s Thesaurus 1911 Edition of English Words and Phrases

ABOUT

A complete hypertext linked thesaurus for any web browser. This distribution was assembled by Nicholas Shea from the Project Gutenberg #22 file ‘roget15a.txt’ (MICRA’s contributed eBook of Roget’s Thesaurus).

This is a work in progress, as I am checking the entire thesaurus, *word for word*, against the original 1911 edition. I aim to correct and improve about 100 heads per week. This work involves:

1. Rectifying OCR errors.
2. Colouring words and phrases according to language.
3. Verifying, referencing and translating the existing phrases and quotations.
4. Enriching the text with additional phrases and quotations in Latin, Dutch, Spanish, Danish, Portuguese, French, etc.

LICENSE

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

BROWSE

View the Classes, Index and Heads in your web browser. The website version is usually more recent than the download version.

DOWNLOAD

Rogets_Thesaurus_1911__UTF8_v1-4-6.zip
MD5 SUM:  2731E0E9DB9CC01A8DB0640D47663EF6
2.99 MB (3,140,060 bytes)
Uploaded on Wed Oct 13th 17:03 BST 2021
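
A downloaded archive can be checked against the MD5 sum listed above. The following Python sketch is one way to do so; it is not part of the distribution, and the function names `md5_of_file` and `matches_published_sum` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the upper-case hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def matches_published_sum(path, published_sum):
    """Compare a file's digest with a published sum, ignoring case and padding."""
    return md5_of_file(path) == published_sum.strip().upper()
```

Assuming the zip file sits in the current directory, `matches_published_sum("Rogets_Thesaurus_1911__UTF8_v1-4-6.zip", "2731E0E9DB9CC01A8DB0640D47663EF6")` should return True for an intact download.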

DETAILS

The thesaurus has been divided into individual HTML head files. Each head has a navigation panel at the top with links to ‘previous’ and ‘next’ entries, as well as links to the Class breakdown and Index. This is a substantial improvement over a single body file and allows for further hierarchical structuring. Further, all Head numbers within the body text are linked to their respective head files.
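
As an illustration of how head numbers in the body text might be turned into links, here is a minimal Python sketch. It is not the author's generator (which is written in C++); it assumes cross-references of the form 'hostile &c. 708', and the file-naming scheme in `href_pattern` is hypothetical.

```python
import re

# Matches a cross-reference such as "&c. 708" or "&c. 737a" and captures
# the head number, including an optional 'a' or 'b' suffix.
CROSS_REF = re.compile(r"(&c\. )(\d{1,4}[ab]?)\b")


def link_head_numbers(text, href_pattern="{head}.html"):
    """Wrap each cross-referenced head number in an anchor tag.

    `href_pattern` is a hypothetical naming scheme, not necessarily the
    one used by the actual distribution. The ampersand in "&c." is left
    as found; HTML escaping is outside the scope of this sketch.
    """
    def to_anchor(match):
        prefix, head = match.group(1), match.group(2)
        href = href_pattern.format(head=head)
        return '{}<a href="{}">{}</a>'.format(prefix, href, head)

    return CROSS_REF.sub(to_anchor, text)
```

Numbers that do not follow '&c. ' (a year such as 1911, say) are left untouched, which keeps the sketch from linking ordinary figures in quotations.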

The index was assembled from the Project Gutenberg file ‘10681-h-index-pos.’ This file was so vast that it took over 40 seconds to load in a browser. So the index has been divided into separate files (A.html to Z.html), and the broken links have been corrected.

The ‘heads’ directory contains all heads extracted from the body as HTML files. All heads are linked. Navigate to the previous and next Heads using the Navigation panel.

The ‘index’ directory contains the entire index split alphabetically into separate HTML files. The original index was over 9.0MB and took a long time to load; it also had many link errors which have been fixed.
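
The alphabetical split of the index can be sketched in Python as follows. This is only an illustration, not the author's code: the function names are hypothetical, and the '#' bucket for entries that do not begin with a plain or accented letter A to Z is an assumption of the sketch.

```python
import unicodedata
from collections import defaultdict

LETTERS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def index_letter(entry):
    """Return the A-Z bucket for an index entry, or '#' if none applies."""
    stripped = entry.strip()
    if not stripped:
        return "#"
    # Decompose accented characters so that 'à' files under 'A'.
    base = unicodedata.normalize("NFD", stripped[0])[0].upper()
    return base if base in LETTERS else "#"


def split_index(entries):
    """Group index entries by initial letter, preserving their order."""
    buckets = defaultdict(list)
    for entry in entries:
        buckets[index_letter(entry)].append(entry)
    return dict(buckets)
```

Each bucket could then be written to its own file (A.html, B.html and so on). Ligatures such as 'Œ' have no Unicode decomposition, so in this sketch they fall into the '#' bucket and would need a hand-written mapping.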

OBSOLETE WORDS

Words marked with a superscript dagger are considered obsolete (although what is considered obsolete is often a matter of conjecture or context). For example, these words are considered obsolete: unhoused, unharbored. Words considered archaic are marked with a superscript double dagger. For example, saturity.

GENERATOR PROGRAM

The ‘generator’ directory contains a small program called ‘roget’ with the source code. This program will parse the custom file ‘roget15a_my_markup.txt’, apply HTML styles and generate the head and index files; the program also generates links and navigation entries for each head. If you wish to rebuild the program, please consult the file 'ReadMe.txt' in the generator directory. IMPORTANT: I have spent hundreds of hours correcting the original OCR text from the Project Gutenberg text file, which had corrupted/omitted many accented characters. The file 'roget15a_my_markup.txt' is in UTF8 format. Do not change the codepage, or the accented characters will be lost. All the index files are also in UTF8 format and should not be tampered with.
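
For anyone processing the markup or index files with their own tools, the following Python sketch shows one way to read and write them without disturbing the UTF8 encoding. The helper names `read_markup` and `write_markup` are hypothetical and are not part of the generator.

```python
def read_markup(path):
    """Read a markup file as UTF-8, failing loudly on invalid bytes."""
    # errors="strict" raises UnicodeDecodeError instead of silently
    # substituting characters, so a wrong codepage is noticed at once.
    with open(path, "r", encoding="utf-8", errors="strict", newline="") as handle:
        return handle.read()


def write_markup(path, text):
    """Write text back as UTF-8 without translating line endings."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
```

Stating the encoding explicitly avoids relying on the platform default, which on some systems is not UTF8.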

The functions in MakeLinks.cpp were coded specifically for my custom file 'roget15a_my_markup.txt'. Much of the code was a KLUDGE to get my HTML thesaurus up and running. MakeLinks.cpp was never intended for general release and is included only for the curious.

Note: the ‘generator’ has been removed from the current distribution but is still available on my Roget’s Thesaurus SourceForge page.

NOTE

Some accented characters have suffered from the original Project Gutenberg OCR translation; the characters are either corrupt or missing entirely. Correcting these errors is an ongoing process that takes a great deal of time and effort. New versions are uploaded when available. Corrections and improvements are listed in the version history.

RANT

It beggars belief that people believe they can simply scan a book, then use Optical Character Recognition to convert that book into editable text without incurring errors. What is worse, they never check for these errors. Mathematical texts suffer very badly in this regard, with formulae rendered completely senseless and unreadable. (Often the original scans are very poor indeed.) Ironically, the scanning of books is supposed to preserve them for future generations. However, without due diligence, the end result is not a preservation, but rather an erosion of our cultural heritage. I find this truly frightening.

VERSION HISTORY
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-4-5
Released on 05 OCT 2021
MD5 SUM: 6ABBF7D5C494C153FD352E723CFCA038 Rogets_Thesaurus_1911__UTF8_v1-4-5.zip

Heads 550 to 580
Corrected, referenced and appended Phrases, Proverbs and Quotations in various
languages. Other minor corrections and additions to previous heads.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-3-9
Released on 28 SEP 2021
MD5 SUM: 5135ABFB79C0C73A6EEDF5DABB801B25 Rogets_Thesaurus_1911__UTF8_v1-3-9.zip

Heads 530 to 550
Corrected, referenced and appended Phrases, Proverbs and Quotations in various
languages. Other minor corrections and additions to previous heads.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-3-4
Released on 25 SEP 2021

Heads 500 to 530
Corrected, referenced and appended Phrases, Proverbs and Quotations in various
languages. Other minor corrections and additions to previous heads.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-3-2
Released on 14 SEP 2021

Heads 360 to 400
Corrected, referenced and appended Phrases, Proverbs and Quotations in various
languages. Some other minor corrections to previous heads.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-3-1
Released on 07 SEP 2021
MD5 SUM: 47FD58A6813A7586452C99566EDBAC4E  Rogets_Thesaurus_1911__UTF8_v1-3-1.zip

Heads 170 to 360
Corrected, referenced and appended Phrases, Proverbs and Quotations in various
languages. Some other minor corrections to previous heads.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-9
Released on 07 SEP 2021
MD5 SUM: C363F0343B95E743C4BB4AC9E0318494  Rogets_Thesaurus_1911__UTF8_v1-2-9.zip

Heads 170 to 360
Corrected, referenced and appended Phrases, Proverbs and Quotations in various
languages. Some other minor corrections to previous heads.

Quotes for Head 265 'Quiescence' were omitted in this release but are included
in the website version. Quotes for Head 360 'Death' contain minor typos where
'befall' was 'befal' and another double 'll' was typed as 'lL'. This is also
corrected in the website version.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-5
Released on 05 SEP 2021
MD5 SUM: B28F2CCE4B705272DC8983D90829EDB1  Rogets_Thesaurus_1911__UTF8_v1-2-5.zip

Heads 140 to 160
Verified, referenced and appended Phrases, Proverbs and Quotations in various
languages.

Note: I missed updating the quotes for Head 151 'Eventuality' in this release;
and in appending Chesterton and Francis Bacon quotes to Head 989 'Irreligion',
I made a typo for the HTML code '’'. These errors have been fixed for the
pages displayed on my website.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-4
Released on 03 SEP 2021

Heads 120 to 140
Verified, referenced and appended Phrases, Proverbs and Quotations in various
languages.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-3
Released on 02 SEP 2021

Heads 100 to 120
Verified, referenced and appended Phrases, Proverbs and Quotations in various
languages.

This is *probably* the last upload before I finish the entire thesaurus, which
will be towards the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-2
Released on 01 SEP 2021

Heads 1 to 100
Verified, referenced and appended Phrases, Proverbs and Quotations in various
languages. This is no small task as the existing quotes and phrases in the
Project Gutenberg file 'Roget15a.txt' are not referenced; sometimes the author
of the quote is given, but the work from which the quote is taken is not; so all
these must be traced down to the original source, and then if needs be corrected
and referenced in full. Note that the 1911 edition has very few Phrases or quotes
and most of the ones given in 'Roget15a.txt' have been added by third parties.
I have greatly extended the Phrase section, providing numerous additional sources
that are useful to me personally, and will hopefully be so for others. With all
this in mind, I have restyled the thesaurus body and put the Phrase section in
its own bordered division.

Appending phrases and quotations is an ongoing process, so whilst the body has
been corrected to Head 450, quotes are still being added to as yet uncorrected
heads beyond 450. For example, Head 790 Restitution.

Extended colouring scheme for Danish, Dutch and Portuguese.

There are simply too many specific changes, corrections and additions to list in
this file.

This is probably the last upload before I finish the entire thesaurus, sometime
before the end of 2021.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-1
Released on 30 AUG 2021

Heads 1 to 450.
Applied styles to quotes.

Head 125 Morning and Head 126 Evening.
Swapped Milton quote "at shut of evening flowers" from Morning to Evening and
corrected it to: "Just then return'd at shut of Evening Flours"
Added more relevant quotes from Milton.

Moved the French phrase "Between a dog and a wolf" to Head 422 Dimness, and also
added it to Head 665 Danger.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-2-0
Released on 29 AUG 2021

Heads 300 to 450.
Detailed corrections, comparing *word for word* with original 1911 edition.
Missing sections replaced and some others corrected.
Corrected various quotations, and added references. Added some Latin words and
provided numerous translations and explanations for the more obscure words and
phrases in various languages.
Extended colouring scheme for German, Greek, Italian and Spanish.

Head 353 'Bubble', I have replaced the tenuous quote from Milton with one from
Alexander Pope that *actually* relates to the Head in question:

“And now a bubble burst, and now a world” {An Essay on Man, Epistle One
(‘Of The Nature and State of Man with Respect to The Universe’)}[Pope].
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-9
Released on 27 AUG 2021

Heads 200 to 300
Detailed corrections, comparing *word for word* with original 1911 edition.
Extended colouring scheme for German, Greek, Italian and Spanish.
Many corrections for missing accents, incorrect spelling, with some additions to
(and removals from) the body text. For example:

Head 1. Existence.
Changed phrase 'ergo sum cogito' to:
'Cogito, ergo sum' {“I think, therefore I am” (axiom formulated by Descartes)}

Head 217. Obliquity.
Added missing sections for V. Adj. and Adv.

Head 210. Summit.
Removed Phrase 'en flûte'
Moved Phr. 'fleur d'eau', capitalized word and added brief description.

Head 240. Form.
Changed [Science of form] 'morphism' to 'morphology'.
Added 'polymorph', 'polymorphic', 'polymorphous'.

Changed N.American spellings of 'color' and 'defense' to English spellings of
'colour' and 'defence'.

Head 250. Convexity.
Added small explanation for 'thank-ye-ma'am':
{diagonal earthen ridge in road used to divert excessive wash, causing a jolt
to those who rode over it in a cart.} [U.S.]

Head 255. Smoothness.
Moved phrases into Phr. section.
Removed phrase: 'slippery as coonshit on a pump handle.' 

Head 264. Motion.
Added translations for phrases.
Moved Goethe quote 'sich ein Charakter in dem Strom der Welt' to Head 5.
Moved Goethe quote 'Es bildet ein Talent sich in der Stille' to Head 893.
Added translations for each.
Added missing footnote from 1911 edition.

Head 268. Traveller
Removed 'condottiere' which is a soldier, and not a traveller per se.

Head 276. Impulse.
Removed: 'boost [U.S.]; bunt, carom, clip y; fan, fan out; jab, plug *.'

Head 277. Recoil
Changed the quote from Newton to what he *actually states* in his Principia
(Axioms, or Laws of Motion). Law III:
'To every action there is always opposed an equal reaction: or the mutual
actions of two bodies upon each other are always equal, and directed to
contrary parts.'
Also added the correct reference in curly brackets.

Head 298. Food
Removed 'ichthyivorous' and added 'ichthyophagous'

Improved quotations and added numerous refs.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-7
Released on 26 AUG 2021

Heads 100 to 200
Detailed corrections, comparing *word for word* with original 1911 edition.
Extended colouring scheme for German, Greek, Italian and Spanish.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-5
Released on 24 AUG 2021

Heads 1 to 100
Detailed corrections, comparing *word for word* with original 1911 edition.
French words and phrases are now in blue, with Latin words and phrases in
golden brown.

Head 14. [Noncoincidence.] Contrariety.
Changed erroneous link in Adj. section 'hostile &c. 703.' to 'hostile &c. 708'.

Head 22. [Thing copied.] Prototype.
Added Latin phrase: 'Exemplumque dei quisque est in imagine parva'
{Every man is a copy of God in miniature} [Lat.][MANILIUS, Astronomicon, IV].
Added Latin phrase: 'O! imitatores! servum pecus!' {Oh! ye imitators, a servile herd!}
(An allusion to the low position occupied by the plagiary and copyist).

Head 80. Normality.
Added missing words from the 1911 edition that are absent in
the original Project Gutenberg file(s) 'roget13.txt' and 'roget15a.txt'.

The aim is to correct 100 heads per upload; I will post updates on a weekly
basis until all 1000 heads are completed.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-4
Released on 23 AUG 2021

Head 735. Adversity.
Added the missing Adverbs and Phrases sections that were mysteriously absent in
the original Project Gutenberg file(s) 'roget13.txt' and 'roget15a.txt'.

Head 845.
Corrected Latin phrase 'Davus sum non Aedipus' to 'Davus sum non Œdipus'
(the MICRA OCR had interpreted the Latin capital 'Oe' ligature as 'Ae').

Corrected 'blase' to 'blasé' in Index and various Head sections.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-3
Released on 23 AUG 2021

Head 20. renamed to Non-imitation (was previously titled noimitation)
Changed incorrect Latin "sui generalis uncommon" to:
sui generalis, infrequens, rarus, insolitus, inusitatus, egregius, unicus,
singularis, non vulgaris, parum consuetus

Head 23. Agreement.
Added Latin translation for:
rem acu tetigisti {you have hit the nail on the head}

Head 591. Printing.
Fixed display of fractional points.

Fixed mismatched italic HTML tags for Latin words/phrases.

Coloured all Latin words so that they stand out.
TO DO: Verify all Latin words and phrases.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-2
Released on 22 AUG 2021
Italicized over 1,392 Latin words/phrases. This was a very tedious operation
that was completed manually. Removed italic style for square brackets.
Numerous other corrections. Added time stamp to footer.

I am trying to style the text using the same attributes found in the 1911
edition of Roget's Thesaurus; although the text in my obtained PDF differs
from the OCR.

TO DO: Italian text verification and styles. German text verification and
styles.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-1-1
Released on 21 AUG 2021
Over 550 French OCR corrections with translations for the more obscure phrases.
Translations are shown in curly brackets, for example:

vouloir prendre la lune avec les dents {to take the moon with the teeth} [Fr.]
houppelande {medieval overgown which was widely worn in the late 14th century
and 15th century} [Fr.]
à rebours {against the grain, against nature} [Fr.]
Après nous, le déluge {After us, the flood} [Fr.] ... etc.

Many other corrections in the thesaurus body.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-9
Released on 19 AUG 2021
Some German character corrections.
Link corrections for Head 8. Circumstance.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-8
Released on 19 AUG 2021
Some Greek character corrections and additions.
Added version number to HTML headers.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-6
Released on 17 AUG 2021
More accented character corrections.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-5
Released on 11 AUG 2021
Corrections to the Project Gutenberg OCR errors for accented characters in the
thesaurus body and index, e.g. crême de menthe, jeu de théâtre, coup de grâce,
coup de maître, bête noire, etc. I have attempted to correct the entire
Thesaurus, although a few errata may remain.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-4
Released on 10 AUG 2021
Corrections for missing French characters and accents that were lost in the
original Project Gutenberg OCR scan conversions.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-3
Released on 10 AUG 2021
Made first attempt to correct original OCR translation errors for accented
characters and ported all Web pages to use UTF8 codepage.
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-2
Released on 07 AUG 2021
Fixed bug in the index generator code where some head numbers appended with
'a' or 'b' did not link correctly. (e.g., "777.a" or "737a").
--------------------------------------------------------------------------------
Rogets_Thesaurus_1911__v1-0-1
Released on 26 JUL 2014
First SourceForge release.
--------------------------------------------------------------------------------

Nicholas Shea, 29 August 2021