Linguist
by finngruwier
Linguist was originally written to give users an easy way to create a list of new candidate words for the existing spellcheck dictionary for their language. Just run the command "List Non-recognized Words" and you will get a list of all the words in your document that are not recognized during spellchecking. The extension has since been enhanced with commands that list every word in a text document (either alphabetically or sorted by word frequency) and with a readability function that expresses readability in a unit called Lix. This extension is written in Python.
License: opensource
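For readers who have not met Lix before, here is a minimal sketch of the measure as it is commonly defined: average sentence length plus the percentage of words longer than six letters. This is not the extension's own code. The function name `lix` and the choice to treat `.`, `!`, `?` and `:` as sentence ends are assumptions made for illustration.

```python
import re


def lix(text: str) -> float:
    """Return the Lix readability score of `text` (0.0 for empty input)."""
    # Words are runs of letters; digits and punctuation are ignored (assumption).
    words = re.findall(r"[^\W\d_]+", text)
    # Sentences are split on ., !, ? and : (assumption; definitions vary slightly).
    sentences = [s for s in re.split(r"[.!?:]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    # A "long" word has more than six letters.
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / len(sentences) + 100.0 * long_words / len(words)


if __name__ == "__main__":
    print(round(lix("The cat sat. The elephant remembered everything."), 1))
```

Lower scores indicate text that is easier to read; the exact thresholds used to label a text as easy or hard differ between sources.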
Comments
Does not work in any document
Hi!
Version 1.40 is a really nice tool. Great! The only problem I have run into so far (after using it for about an hour) is that the calculations work differently in different documents. In some documents I can get all the calculations (unrecognized words, word list, word frequency); in others only the complete word list works. The only difference between the documents is the number of header levels.
JusSi
Count the occurrences of each word
Hi, I modified the extension according to a request on a mailing-list:
http://www.nabble.com/Writer---Word-Frequency--to21302458.html
Now the extension also counts the occurrences of each word.
Here is the modified extension:
http://www.nabble.com/file/p21341084/Linguist-1.2.2-wordscount.oxt
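As a rough illustration of what counting word occurrences involves, here is a minimal sketch in plain Python. It is not taken from the modified extension. The function name `word_lists`, the lowercasing of words and the tie-breaking rule (alphabetical among equally frequent words) are all assumptions.

```python
import re
from collections import Counter


def word_lists(text: str):
    """Return (alphabetical word list, [(word, count), ...] by descending count)."""
    # Treat runs of letters as words and fold case so "The" and "the" merge (assumption).
    words = [w.lower() for w in re.findall(r"[^\W\d_]+", text)]
    counts = Counter(words)
    alphabetical = sorted(counts)
    # Most frequent first; ties are broken alphabetically (assumption).
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_frequency


if __name__ == "__main__":
    alpha, freq = word_lists("the cat and the dog and the bird")
    print(alpha)
    print(freq)
```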
Cool!
I'll integrate that in the next "official" version.
Finn Gruwier Larsen
WrongWordsList macro
This extension works in a similar fashion to this macro:
http://user.services.openoffice.org/en/forum/viewtopic.php?f=20&t=1222&s...
Did you use that macro for inspiration?
Sub WrongWordsList
    Dim oDocModel as Variant
    Dim oTextCursor as Variant
    Dim oLinguSvcMgr as Variant
    Dim oSpellChk as Variant
    Dim oListDocFrame as Variant
    Dim oListDocModel as Variant
    Dim sListaPalabras as String
    Dim aProp() As New com.sun.star.beans.PropertyValue
    ' Get the active document and make sure it is a text document
    oDocModel = StarDesktop.CurrentFrame.Controller.getModel()
    If IsNull(oDocModel) Then
        MsgBox("There's no active document." + Chr(13))
        Exit Sub
    End If
    If Not HasUnoInterfaces (oDocModel, "com.sun.star.text.XTextDocument") Then
        MsgBox("This document doesn't support the 'XTextDocument' interface." + Chr(13))
        Exit Sub
    End If
    oTextCursor = oDocModel.Text.createTextCursor()
    oTextCursor.gotoStart(False)
    ' Get the spellchecker from the linguistic service manager
    oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
    If Not IsNull(oLinguSvcMgr) Then
        oSpellChk = oLinguSvcMgr.getSpellChecker()
    End If
    If IsNull (oSpellChk) Then
        MsgBox("It's not possible to access the spellchecker." + Chr(13))
        Exit Sub
    End If
    ' Walk through the document word by word
    Do
        If oTextCursor.isStartOfWord() Then
            oTextCursor.gotoEndOfWord(True)
            ' Check whether the word is spelled correctly, using the word's own locale
            If Not isEmpty (oTextCursor.getPropertyValue("CharLocale")) Then
                If Not oSpellChk.isValid(oTextCursor.getString(), oTextCursor.getPropertyValue("CharLocale"), aProp()) Then
                    sListaPalabras = sListaPalabras + oTextCursor.getString() + Chr(13)
                End If
            End If
            oTextCursor.collapseToEnd()
        End If
    Loop While oTextCursor.gotoNextWord(False)
    If Len(sListaPalabras) = 0 Then
        MsgBox("There are no errors in the document.")
        Exit Sub
    End If
    ' Reuse the list document if it is already open, otherwise create it
    oListDocFrame = StarDesktop.findFrame("fListarPalabrasIncorrectas", com.sun.star.frame.FrameSearchFlag.ALL)
    If IsNull(oListDocFrame) Then
        oListDocModel = StarDesktop.loadComponentFromURL("private:factory/swriter", "fListarPalabrasIncorrectas", com.sun.star.frame.FrameSearchFlag.CREATE, aProp())
        oListDocFrame = oListDocModel.CurrentController.getFrame()
    Else
        oListDocModel = oListDocFrame.Controller.getModel()
    End If
    ' Append the misspelled words to the list document and bring it to the front
    oTextCursor = oListDocModel.Text.createTextCursor()
    oTextCursor.gotoEnd(False)
    oListDocModel.Text.insertString (oTextCursor, sListaPalabras, False)
    oListDocFrame.activate()
End Sub
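Since Linguist itself is written in Python, the core loop of the macro above might look roughly like this in Python. This is only a sketch and not Linguist's actual code. The function name `list_unrecognized_words` is made up, and the document and spellchecker are passed in as arguments so the loop makes no assumption about how they are obtained. The cursor and spellchecker methods are the same ones the Basic macro calls.

```python
def list_unrecognized_words(doc, spell_checker):
    """Return the words in `doc` that `spell_checker` rejects.

    Each word is checked against its own "CharLocale" property, as in the
    Basic macro, so the document language is used rather than the GUI language.
    """
    cursor = doc.Text.createTextCursor()
    cursor.gotoStart(False)
    unrecognized = []
    while True:
        if cursor.isStartOfWord():
            cursor.gotoEndOfWord(True)  # extend the selection over the word
            word = cursor.getString()
            locale = cursor.getPropertyValue("CharLocale")
            # The empty tuple stands in for an empty property sequence.
            if word and locale is not None and not spell_checker.isValid(word, locale, ()):
                unrecognized.append(word)
            cursor.collapseToEnd()
        # Stop when there is no further word to move to.
        if not cursor.gotoNextWord(False):
            break
    return unrecognized
```

Inside a Python macro, the document would typically come from `XSCRIPTCONTEXT.getDocument()` and the spellchecker from the `com.sun.star.linguistic2.LinguServiceManager` service via `getSpellChecker()`, mirroring what the Basic code does.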
WrongWordsList
No, I haven't seen that macro before. But as far as I can tell from the code, it basically does the same thing. It is easier, though, to install an extension than to install a piece of macro code. :-)
Finn
GUI language vs. document language.
That macro, however, does a better job, since it retrieves only misspelled words.
Your extension has problems because it checks the GUI language, not the document language.
I write in Italian, and Linguist selects almost every word.
I'll keep using the macro. You should fix that feature.
GUI language vs. document language.
I just installed an Italian dictionary and ran Linguist on an Italian text after setting the default document language to Italian. In a text of 145 words it found just 4 unknown words.
Just set the default document language correctly in the Tools > Options > Languages menu, and it will work.
Btw: One of the four unknown words was 'Berlusconi' :-)
Install the Italian surnames dictionary
> Btw: One of the four unknown words was 'Berlusconi' :-)
This is correct, because that is not an Italian word. It is a surname.
If you would like Italian surnames not to be reported as wrong words, install the Italian surnames dictionary.
At the moment you can only do this manually, but in the coming days or weeks I will create an extension that will do the whole job for OOo >= 3.0.
The dictionary can be found here, along with installation instructions (in Italian only):
http://linguistico.sf.net/wiki/doku.php?id=dizionario_cognomi_italiani
The site http://linguistico.sf.net is the home page of the Italian dictionary, Italian surname dictionary and Italian thesaurus
Ciao
Davide
OOo 3.0 beta: ‘List Unrecognized Words’ lists all words
I've installed the latest release of this extension and have run it on several documents with Russian as the language of the Default style. In all instances, the result is the same: ‘List Unrecognized Words’ lists all words, not only misspelled ones. Is this because my GUI language is English (as Russian is not available so far)?
Please check the new version...
Finn Gruwier Larsen
OOo 3.0 beta: ‘List Unrecognized Words’ lists all words
Thanks for your comment. I am sorry to say that the current version still checks the GUI language, not the document language. I have planned to make a new version that checks the document language instead. I will see if I can get it done soon.
Finn Gruwier Larsen
Nice tool!
I find this tool really useful!
My only wish is that the statistics opened in a popup instead of in a new document.
Thanks for this useful extension.
jo
New version ready
Version 1.1 adds readability (lix) measuring and other document statistics.
It doesn't seem to work
For big documents (and even not-so-big ones), the app freezes.
When I tried it with just one paragraph containing a few unrecognized words, it created a new, empty document.
Please send a document that
Please send a document that doesn't work to finn (at) gruwier.dk.
Finn Gruwier Larsen
Language: Danish
Finn,
Great script, for many reasons. But you forgot to say that the language is Danish ;-)
Leif Lodahl
http://lodahl.blogspot.com
Language
Hi Leif,
It should definitely work with other locales. As mentioned in the description, Linguist tests the GUI locale setting to determine which locale to use for spellchecking. I've just verified that it works with US English. But there might be locales that it doesn't work with, especially if there are locales (other than Danish) that do not conform to the language-COUNTRY formula. In fact the script treats Danish as a special case, since for unknown reasons the GUI locale setting for Danish is not called 'da-DK', but just 'da'.
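To make the language-COUNTRY point concrete, here is a small sketch of how a GUI locale name could be split, with a fallback for bare names such as 'da'. It is not the script's real code. The function name `parse_gui_locale` and the fallback table are hypothetical, and only the Danish entry comes from the description above.

```python
def parse_gui_locale(name, default_countries=None):
    """Split a GUI locale name such as 'en-US' into (language, country)."""
    if default_countries is None:
        # Hypothetical fallback table; Danish is the one special case mentioned.
        default_countries = {"da": "DK"}
    language, _, country = name.partition("-")
    if not country:
        # A bare language code: fall back to a known country, or leave it empty.
        country = default_countries.get(language, "")
    return language, country


if __name__ == "__main__":
    print(parse_gui_locale("en-US"))
    print(parse_gui_locale("da"))
```

A bare code with no entry in the table comes back with an empty country, which is the situation in which spellchecking could pick the wrong locale.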
Finn
Document language
Couldn't Linguist check the language of the document instead of the GUI? I often work with documents in languages other than the interface language and would like to use Linguist to create wordlists for spellcheckers.
Language settings
Hi Erdal,
I don't think there is a setting for the language of the document as such. In fact, each paragraph of a document can have its own language setting. But there is a setting called "default language for documents". If I used that, you could set it to the language you want to spellcheck in and still have another language in the GUI. I will consider that next time I make a release.
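One way to picture the options discussed here is as a fallback order: use the language set on the text itself if there is one, otherwise the default language for documents, otherwise the GUI language. The sketch below only illustrates that idea and is not how any released version of Linguist works. The function name `choose_spellcheck_locale` and its parameters are hypothetical.

```python
def choose_spellcheck_locale(text_locale, document_default, gui_locale):
    """Return the first locale that is set, in order of specificity."""
    for candidate in (text_locale, document_default, gui_locale):
        if candidate:  # skip None and empty strings
            return candidate
    raise ValueError("no locale available for spellchecking")


if __name__ == "__main__":
    # Russian document default with an English GUI: the document default wins.
    print(choose_spellcheck_locale(None, "ru-RU", "en-US"))
```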
Finn